In this report three problems in computer communication are
considered  within  the  framework  of  non-classical  control  theory. 

First, in Chapter 2 we deal with the problem of sharing one communication
wire among a number of stations. The fact that all communication
stations  are  identical  and  that  they  share  one  objective  of  using  the 
communication  wire  as  efficiently  as  possible  leads  to  the  concept  of 
symmetric  team  problems.  Symmetric  solutions  to  symmetric  team 
problems  are  characterized  by  the  restriction  that  all  decision  makers 
must  have  identical  decision  rules.  In  the  second  section  the  access 
problem  in  multi-access  wire  communication  is  considered  as  a symmetric 
team problem. It is shown that the symmetric solution, which corresponds
to randomized access rules, tends to give as good performance as the
unrestricted solution when the number of stations becomes large.

In  the  second  problem,  which  is  considered  in  Chapter  3,  stations 
communicate through a packet switched multi-access satellite channel.
The stations can only share (control-)information with considerable
delay:  the  round  trip  time  to  the  satellite.  A simple  model  is  developed 
for  which  it  can  be  shown  under  some  assumptions  that,  because  of  the 
delay  in  information,  the  optimal  decision  rule  can  be  an  open-loop  decision 
rule.  This  decision  rule  is  determined  separately  for  "new"  packets  and 
for  "collided"  packets. 

In  Chapter  4,  the  last  problem  is  considered.  The  communication 
medium  here  is  a distributed  packet  switched  computer  communication 
network. The problem considered is the question of what is the
"best" information to base decisions on. It is recognized that the
statistical parameters which describe the system are themselves
varying in time in a random fashion. This leads to a cascade of
stochastic processes that describe the most essential parameter: the
delay or travel time of a packet going from one node to another.
Different classes of information policies correspond to the different
levels of the cascade at which the delay is described. We analyze
which class of information is best for routing decisions, and we also
consider other design choices that affect how and what control
information is exchanged between the communication computers at the
nodes of the network.

CHAPTER 1: INTRODUCTION

Control  theory  has  been  and  still  is  largely 
devoted  to  problems  in  which  there  is  one  controller,  alias 
decision  maker.  In  this  aspect  of  control  theory  the 
developments  are  now  very  advanced.  Control  theory  for 
systems  in  which  there  is  more  than  one  decision  maker  is 
relatively underdeveloped, but this aspect, sometimes
referred to as non-classical control theory, has recently
enjoyed increased interest [1]. Many practical problems
involve large-scale systems which (one could say: by
definition) require decentralized control, i.e. there is
more than one decision maker and the different decision
makers do not all have identical information on which to base
their decisions [2]. The control problems in large-scale
systems,  such  as  economic  and  social  systems,  are  of  great 
complexity  and  at  present  there  is  no  general  theory  to 
solve  those  problems.  Among  large-scale  systems,  computer 
communication  systems  are  relatively  simple  in  the  sense 
that  many  control  problems  in  computer  communication  can  be 
described  by  simple  models.  It  seems  therefore  that  joining 
non-classical  control  theory  and  problems  in  decentralized 
control  of  computer  communication  forms  a good  basis  for  new 
research. This report builds on this basis and shows that
joining the two fields is mutually beneficial: in one direction,
the framework of non-classical control theory enables us to
formulate a few problems in computer communication clearly
and to find solutions; in the other direction, the study of
computer communication problems gives rise to concepts that
extend the framework of non-classical control theory.

Computer  communication  means  communication  among 
computers  as  well  as  communication  by  computers.  The  most 
suited  technology  for  computer  communication  is  packet 
switching  as  opposed  to  circuit  switching  [3].  Circuit 
switching  is  the  technology  employed  in  the  telephone 
system;  here,  there  exists  a dedicated  path  from  end 
point  to  end  point  for  the  duration  of  the  connection.  With 
packet  switching  there  are  strings  of  binary  information 
(packets)  which  travel  from  one  end  point  of  the  (logical) 
connection  to  the  other,  but  the  actual  path  taken  by  a 
packet need not be the same for each packet. In fact,
communication computers decide which way and when a
packet travels. In general the communication computers are
geographically  distributed  and  can  only  share  information 
for  decision  making  through  the  same  communication  medium 
that  they  control.  The  problems  that  are  treated  in  this 
report  pertain  to  this  process  of  decentralized  decision 
making  by  communication  computers  in  a packet  switched 
environment.  Although  presently  packet  switching  is  used 
almost  exclusively  for  communication  among  computers,  this 
aspect is not crucial; most of the results also apply to
other uses of packet switching such as telex
communication or voice communication.

Decentralized control of computer communication,
like any other decentralized control problem, is not
only a question of what the controls should be in order to
achieve good performance with respect to some given
criterion, but also (and maybe most importantly) a
question of what the information should be on which the
controls are based. The importance of the latter question,
which can be restated as: what is the
information structure of a multi-person decision problem,
was pointed out in [4] and later made more explicit in
[5] and [6]. The three following chapters, in which three
different problems in computer communication are considered,
all have, in one way or another, the question of
information as a central point.

In  chapter  2 we  consider  some  aspects  of  the 
problem  of  sharing  one  communication  wire  among  a (possibly 
large)  number  of  communication  stations.  This  problem  is  a 
team  problem  because  the  stations  also  share  one  common 
objective,  namely,  to  use  the  communication  wire  as 
efficiently  as  possible.  The  problem  is  symmetric,  or 
invariant  under  a permutation  of  the  stations,  in  the  sense 
that all stations are identical. For example, it makes no
difference which stations have a packet to send at a given
point in time; only the number of stations that have a
packet to send matters. A symmetric solution to a symmetric
team problem is defined as a solution in which all decision
makers (stations, in this context) must have identical
decision rules. In terms of information structure this has
the  following  meaning.  Let  the  decision  makers  be  numbered 
1,2,...,N.  Then  a symmetric  solution  is  just  one  decision 
rule  (for  N stations)  which  is  not  a function  of  the 
information  of  what  number  is  assigned  to  a station. 

In  the  first  section  of  chapter  2 the  concepts  of  symmetric 
team  problems  and  symmetric  solutions  are  developed  and 
motivated.  In  the  second  section  these  concepts  are  applied 
to  multi-access  wire  communication  and  it  is  shown  how  the 
symmetric  solution  corresponds  to  randomized  access  rules. 

Also  it  is  shown  that  the  symmetric  solution  tends  to  give 
as  good  performance  as  the  unrestricted  solution  when  the 
number  of  stations  becomes  large. 

Chapter  3 deals  with  stations  that  communicate 
through a satellite channel. The stations can only share
(control-)information with considerable delay: the round
trip time to the satellite. A simple model is developed for
which it can be shown that a) the private (i.e. not yet
shared) information is not needed for the optimal decision
rule, and b) given a few reasonable assumptions, beyond a
certain relative value of the delay, the delayed information
is also not needed for the optimal decision rule. These two
points imply that the optimal decision rule can be an
open-loop decision rule. This decision rule is determined
for both the case that packets to be transmitted are "new"
packets and the case that, because of earlier interference
of two or more stations, "collided" packets need to be
retransmitted.

In  chapter  4 the  topology  of  the  communication 
medium  is  very  general,  namely  that  of  a distributed 
computer  communications  network,  through  which  packets  must 
be  routed  from  source  to  destination.  The  analysis  is 
correspondingly  broader  and  less  detailed.  The  perspective 
is  broader  in  the  specific  sense  that  the  statistical 
parameters  which  describe  the  system,  themselves  are 
recognized  to  be  varying  in  time  in  a random  fashion.  This 
leads to a cascade of stochastic processes that describe the
most essential parameter: the delay or travel time of a
packet going from one node to another. The cascade goes from
the actual value of the delay, which is a very rapidly
changing quantity, to the long term expected delay, which is
a constant. Different classes of information policies
correspond to the different levels of the cascade at which
the delay is described. We will analyse which class of
information is best for the routing decisions and also
consider other design choices which affect how and what control
information is exchanged between the communication computers
at the nodes of the network.

Finally,  in  chapter  5 we  will  present  some 
concluding  remarks. 


Many  large  scale  system  problems  are  of  the  kind 
where  there  are  many  decision  makers  that  all  share  one 
common  objective.  Such  problems  fall  by  definition  in  the 
category of team problems. As an instance of this kind of
problem we are specifically thinking of computer
communication  problems,  where  there  are  many  computers  that 
must  each  decide  when  and  how  to  access  the  common 
communication  medium,  sharing  the  objective  of  using  the 
medium  as  efficiently  as  possible.  The  purpose  of  this 
section  is  to  investigate  the  case  where  all  decision  makers 
are  identical  and  to  examine  the  consequences  of  this 
symmetry.  Here  we  shall  restrict  ourselves  to  finite  team 
problems,  where  each  decision  maker  can  choose  only  from  a 
finite  number  of  possible  strategies.  This  restriction  is 
made  to  provide  a conceptually  simple  context,  in  which  the 
phenomena  that  arise  in  symmetric  team  problems  can  easily 
be  demonstrated. 

(2:1.1)  Dm f ini t ions.  A strategy  is  defined  as  a map  that 
assigns  a decision  to  each  possible  value  of  a decision 
maker's information. A (finite) team problem in normal or
strategic form is represented by an N-tensor $A(\cdot,\ldots,\cdot)$
(in the case N=2 this is a matrix $A(\cdot,\cdot)$) of which the
element $A(j_1,j_2,\ldots,j_N)$ gives the cost when decision maker i
chooses strategy $j_i$, i=1,...,N. The problem is to find for
each decision maker a probability distribution (mutually
independent)  on  his  set  of  possible  strategies,  such  that 
the  expected  cost  with  respect  to  the  joint  probability 
distribution  is  minimized.  A particular  choice  of  such  a 
probability  distribution  is  called  a randomized  strategy.  In 
the  case  that  the  probability  distribution  is  degenerate, 
i.e.  the  decision  maker  will  pick  one  particular  element  of 
his  set  of  possible  strategies  with  probability  1,  we  call 
the  strategy  a pure  strategy.  So  the  space  of  randomized 
strategies  also  contains  pure  strategies  which  correspond  in 
a trivial  way  to  what  we  initially  called  strategies.  From 
now on 'strategy' will implicitly mean 'randomized
strategy'. A set of N strategies (one for each decision
maker)  is  called  a strategy-tuple.  A symmetric  team  problem 
is  a team  problem  for  which  the  cost  tensor  is  invariant 
under  any  permutation  of  its  indices.  For  a symmetric  team 
problem  a symmetric  strategy-tuple  is  defined  as  a 
strategy-tuple  which  has  all  strategies  identical.  A 
solution  is  a strategy-tuple  that  minimizes  the  expected 
cost. A symmetric solution is a strategy-tuple that
uniquely minimizes expected cost. The uniqueness means that
the implementation of a symmetric solution does not require
a priori agreement among the team members on more than the
model and cost. Note that with these definitions 'solution'
implies  optimality  and  'symmetric  solution'  implies 
optimality  and  uniqueness. 

(2:1.2) Lemma. For a given team problem there is always a
pure  strategy-tuple  for  which  the  minimum  cost  is  achieved. 
If  we  restrict  ourselves  to  symmetric  strategy-tuples,  the 
optimal  strategy  for  a symmetric  team  problem  is  not 
necessarily  a pure  strategy-tuple. 

Proof. Let $(\hat{j}_1,\hat{j}_2,\ldots,\hat{j}_N)$ be an index of the cost tensor
which corresponds to the smallest value of the tensor, i.e.
for any meaningful index $(j_1,j_2,\ldots,j_N)$ the following
inequality must hold:

(2:1.3) $A(\hat{j}_1,\hat{j}_2,\ldots,\hat{j}_N) \le A(j_1,j_2,\ldots,j_N)$

Then the pure strategy-tuple in which decision maker i
chooses strategy $\hat{j}_i$ with probability 1 will give the minimum
cost, because any other strategy-tuple will result in an
expected cost that is a weighted average of values which
appear in the right-hand side of inequality (2:1.3). To
show that this is not necessarily the case in symmetric team
problems when we have the restriction of symmetric
strategy-tuples, we shall look at a specific (counter)
example. Consider the symmetric team problem with N=2 in which
each decision maker has two different pure strategies with
the corresponding cost given by the matrix

$\begin{pmatrix} 1 & 0 \\ 0 & a \end{pmatrix}$

The set of all symmetric randomized strategy-tuples is
parametrized by a single parameter u ($0 \le u \le 1$), representing
the probability with which decision maker i will choose
strategy 1 (i=1,2). The probability of choosing strategy 2
is then automatically 1-u. By definition the two
randomizations are done independently. The expected cost for
a randomized symmetric strategy-tuple is then given by

$u^2 + a(1-u)^2 = (1+a)u^2 - 2au + a.$

By elementary calculation we see that the constrained
minimum will be at u=0 for $a \le 0$ and at u=a/(a+1) for a>0.
Thus, for positive values of a, the optimal strategy-tuple
is not a pure strategy-tuple.
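The counterexample can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: the function names `symmetric_cost`, `best_symmetric` and `best_pure` and the grid search are illustrative assumptions. It evaluates the expected cost $u^2 + a(1-u)^2$ of a symmetric randomized strategy-tuple on a grid of values of u and compares the result with the unrestricted optimum, which by the first part of the lemma is simply the smallest entry of the cost matrix.

```python
from itertools import product


def symmetric_cost(u, a):
    """Expected cost when both decision makers choose strategy 1 with probability u."""
    return u * u * 1.0 + (1.0 - u) * (1.0 - u) * a


def best_symmetric(a, grid=10001):
    """Grid search over u in [0, 1]; returns (minimizing u, minimum cost)."""
    candidates = [k / (grid - 1) for k in range(grid)]
    u_star = min(candidates, key=lambda u: symmetric_cost(u, a))
    return u_star, symmetric_cost(u_star, a)


def best_pure(a):
    """Unrestricted optimum: the smallest entry of the cost matrix [[1, 0], [0, a]]."""
    matrix = [[1.0, 0.0], [0.0, a]]
    return min(matrix[i][j] for i, j in product(range(2), repeat=2))


if __name__ == "__main__":
    for a in (-0.5, 0.0, 1.0, 3.0):
        u_star, cost = best_symmetric(a)
        print(f"a={a:5.1f}  u*={u_star:.4f}  symmetric cost={cost:.4f}  "
              f"unrestricted optimum={best_pure(a):.4f}")
```

For a=3, for instance, the grid search returns u=0.75=a/(a+1) with expected cost 0.75, whereas the unrestricted optimum is the asymmetric pure strategy-tuple with cost 0.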


The  optimality  of  pure  strategy-tuples  in  team 
problems  is  in  fact  a straightforward  extension  of  a well 
known  result  in  Bayesian  decision  theory.  The  need  for 
randomized  strategy-tuples  under  the  restriction  of  symmetry 
is  a new  phenomenon.  To  demonstrate  the  usefulness  of  this 
imposed  restriction  we  shall  first  consider  an  example  of  a 
symmetric  team  problem  for  which  the  restricted  solution 
models some dynamics we may encounter in everyday life.
This suggests that symmetric team problems do exist and that
symmetric solutions are implemented in practice. After the
example  we  shall  discuss  some  other  desirable  properties  of 
randomized  symmetric  strategy-tuples.  Finally  in  section 
(2:2)  it  is  shown  how  one  may  impose  symmetry  in  problems 
that  are  not  necessarily  symmetric. 

(2:1.4) Example. (Corridor Problem) Consider a corridor that
is conceptually divided into 3 lanes. Walking in opposite
directions through the corridor are 2 decision makers
(d.m.'s). Initially both d.m.'s are in the same lane, say,
the middle lane, and they still have t=T steps to go before
they will either pass each other or, if they cannot decide
on using different lanes, run into each other. The
objective is to find a symmetric strategy-tuple that
minimizes the probability of such a collision. To describe
the problem more precisely, define the state when there are
still t steps to go, x(t), as follows:

x(t) = "n" if both d.m.'s are in the north lane
     = "m" if both d.m.'s are in the middle lane
     = "s" if both d.m.'s are in the south lane
     = "o" if the d.m.'s are in different lanes

The horizontal movement is fixed to be one step per unit of
time for both d.m.'s, but the vertical (i.e. north-south)
movement depends on the strategies of the d.m.'s. The
possible decisions (i.e. values of a strategy) are given by

$u_i(t)$ = "u" to move one lane up
         = "h" to stay in the same lane
         = "d" to move one lane down
(i=1,2)

Of course the strategies $u_i(t)$="u" or $u_i(t)$="d" are
excluded when respectively x(t)="n" or x(t)="s". The
problem ceases to be a problem as soon as x(t) becomes "o";
it is assumed that then both d.m.'s keep their lane.


[Figure 2:1.a. Initial state (x(T)="m") and possible ... in the Corridor Problem. Axis: steps to go, t=T, ..., 2, 1, 0.]

The dynamics are then represented by the following two
tables.

Table for next state x(t-1) when $u_1(t) = u_2(t) = u$
(a dash marks an excluded decision):

  x(t)   | u="u" | u="h" | u="d"
  -------+-------+-------+------
  "n"    |   -   |  "n"  |  "m"
  "m"    |  "n"  |  "m"  |  "s"
  "s"    |  "m"  |  "s"  |   -

Table for next state x(t-1) when $u_1(t) \ne u_2(t)$:

  x(t-1)="o" for x(t)="n", "m" or "s"
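The transition rule of the Corridor Problem can also be written down as a small lookup table. The sketch below is an illustration in Python and not code from the report; the names `SAME_DECISION` and `next_state` are assumptions, and "up" is taken to mean "toward the north lane".

```python
# Next state x(t-1) when both decision makers take the same decision.
# Excluded decisions ("u" in the north lane, "d" in the south lane) are absent.
SAME_DECISION = {
    ("n", "h"): "n", ("n", "d"): "m",
    ("m", "u"): "n", ("m", "h"): "m", ("m", "d"): "s",
    ("s", "u"): "m", ("s", "h"): "s",
}


def next_state(x, u1, u2):
    """Return x(t-1) given the state x(t) and the two decisions u1(t), u2(t)."""
    if x == "o":
        # Once the d.m.'s are in different lanes they keep their lanes.
        return "o"
    for u in (u1, u2):
        if (x, u) not in SAME_DECISION:
            raise ValueError(f"decision {u!r} is excluded in state {x!r}")
    # Different decisions taken from a common lane always separate the d.m.'s.
    return SAME_DECISION[(x, u1)] if u1 == u2 else "o"


if __name__ == "__main__":
    print(next_state("m", "u", "u"))  # both move up from the middle lane
    print(next_state("m", "u", "d"))  # the d.m.'s separate
```

Keeping the excluded decisions out of the table means that an invalid move is reported as an error instead of silently producing a state.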


A symmetric strategy-tuple specifies probability
distributions (u,h,d) as a function of the d.m.s'
information. In this model the information of both d.m.'s is
the present values of t and x(t). If we define the cost to
be 1 when x(0)≠"o" and to be 0 when x(0)="o", then the
expected cost for a given strategy-tuple is equal to the
probability of the two d.m.'s running into each other. The
problem is to find the symmetric solution.


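Define M(t) as the
minimum expected cost when x(t)="m" and define S(t) as the
minimum expected cost when x(t)="n" or "s" (clearly in both
states the cost is the same by a simple symmetry argument),
then M(t) and S(t), for t=0,1,2,..., are given by

(2:1.6) $M(t) = \frac{2}{\lambda_1^{t-1}(1+2\lambda_1)+\lambda_2^{t-1}(1+2\lambda_2)}$

        $S(t) = \frac{4}{\lambda_1^{t-1}(1+3\lambda_1)+\lambda_2^{t-1}(1+3\lambda_2)}$

where

(2:1.7) $\lambda_1 = 1-\sqrt{2}$, $\lambda_2 = 1+\sqrt{2}$

and the optimal strategy, for t=1,2,3,..., is represented
by

(2:1.8) $(u,h,d) = \frac{(0,\ M(t-1),\ S(t-1))}{M(t-1)+S(t-1)}$ when x(t)="n"

        $(u,h,d) = \frac{(M(t-1),\ S(t-1),\ M(t-1))}{2M(t-1)+S(t-1)}$ when x(t)="m"

        $(u,h,d) = \frac{(S(t-1),\ M(t-1),\ 0)}{M(t-1)+S(t-1)}$ when x(t)="s"

For $t \to \infty$ the values of the optimal strategy converge
rapidly to

(2:1.9) $(u,h,d) = \frac{1}{1+\sqrt{2}}\,(0,\ 1,\ \sqrt{2})$ when x(t)="n"

        $(u,h,d) = \frac{1}{2+\sqrt{2}}\,(1,\ \sqrt{2},\ 1)$ when x(t)="m"

        $(u,h,d) = \frac{1}{1+\sqrt{2}}\,(\sqrt{2},\ 1,\ 0)$ when x(t)="s"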
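These relations are easy to check numerically before going through the proof. The Python sketch below is an illustration only. It assumes the closed form for M(t) and S(t) carries the exponent t-1 (the reading required by the base case M(0)=S(0)=1 verified in the proof), with $\lambda_1 = 1-\sqrt{2}$ and $\lambda_2 = 1+\sqrt{2}$, and it uses the one-step recursions M(t+1)=S(t)M(t)/(2M(t)+S(t)) and S(t+1)=S(t)M(t)/(M(t)+S(t)) that appear in the proof. The function names are hypothetical.

```python
from math import isclose, sqrt

LAM1 = 1.0 - sqrt(2.0)  # lambda_1 of (2:1.7)
LAM2 = 1.0 + sqrt(2.0)  # lambda_2 of (2:1.7)


def closed_form(t):
    """M(t) and S(t) from the closed form, read with exponent t-1 (assumption)."""
    m = 2.0 / (LAM1 ** (t - 1) * (1 + 2 * LAM1) + LAM2 ** (t - 1) * (1 + 2 * LAM2))
    s = 4.0 / (LAM1 ** (t - 1) * (1 + 3 * LAM1) + LAM2 ** (t - 1) * (1 + 3 * LAM2))
    return m, s


def by_recursion(t_max):
    """[(M(t), S(t)) for t = 0..t_max] from the one-step optimality recursion."""
    m, s = 1.0, 1.0  # cost 1 if both d.m.'s share a lane when t = 0
    values = [(m, s)]
    for _ in range(t_max):
        m, s = s * m / (2 * m + s), s * m / (m + s)
        values.append((m, s))
    return values


def optimal_strategy(x, m_prev, s_prev):
    """(u, h, d) of (2:1.8) for state x(t), given M(t-1) and S(t-1)."""
    if x == "m":
        z = 2 * m_prev + s_prev
        return (m_prev / z, s_prev / z, m_prev / z)
    z = m_prev + s_prev
    if x == "n":
        return (0.0, m_prev / z, s_prev / z)
    return (s_prev / z, m_prev / z, 0.0)  # x == "s"


if __name__ == "__main__":
    for t, (m, s) in enumerate(by_recursion(6)):
        cm, cs = closed_form(t)
        print(t, m, s, isclose(m, cm, rel_tol=1e-9), isclose(s, cs, rel_tol=1e-9))
```

The recursion gives M(1)=1/3, S(1)=1/2, M(2)=1/7 and S(2)=1/5, in agreement with the closed form.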

Proof. We shall prove relations (2:1.6) and (2:1.8) by
induction. Suppose (2:1.6) is true for a particular value
of t. When there are t+1 steps to go, and both d.m.'s are in
the middle lane (x(t+1)="m") they face the following
problem: they have to find the probabilities u,h,d such as
to minimize the expected cost. The possible outcomes of the
two independent randomizations are: a.) with probability $h^2$
both d.m.'s moved horizontally and have now expected cost
M(t); b.) with probability $u^2+d^2$ both d.m.'s made a step
aslant in the same direction and have now expected cost
S(t); c.) with probability $1-u^2-h^2-d^2$ they came in different
lanes and have now cost 0. This gives

$M(t+1) = \min_{u+h+d=1} \{ h^2 M(t) + (u^2+d^2) S(t) \}$

At the constrained minimum the gradient of the expression
within "{ }" must be perpendicular to the surface u+h+d=1
which implies

$\frac{\partial\{\ldots\}}{\partial u} = \frac{\partial\{\ldots\}}{\partial h} = \frac{\partial\{\ldots\}}{\partial d}$

i.e.

$2uS(t) = 2hM(t) = 2dS(t)$

together with the constraint this gives

$h = \frac{S(t)}{2M(t)+S(t)}$, $u = d = \frac{M(t)}{2M(t)+S(t)}$

at the minimum. Here we can already conclude that assuming
(2:1.6) to be true for (a particular value of) t implies the
second line of (2:1.8) to be true for t+1. Further, the
value of the minimization is

$M(t+1) = \frac{S^2(t)M(t) + 2M^2(t)S(t)}{(2M(t)+S(t))^2} = \frac{S(t)M(t)}{2M(t)+S(t)}$

Referring to (2:1.6), write M(t)=2/a and S(t)=4/b then

$M(t+1) = \frac{(2/a)(4/b)}{4/a+4/b} = \frac{2}{a+b}$

Now substitute the appropriate expressions for a and b to
obtain

$M(t+1) = \frac{2}{\lambda_1^{t-1}(2+5\lambda_1)+\lambda_2^{t-1}(2+5\lambda_2)}$

Using (2:1.7) one can easily verify that $2+5\lambda_1 = \lambda_1(1+2\lambda_1)$ and
that $2+5\lambda_2 = \lambda_2(1+2\lambda_2)$. Therefore

$M(t+1) = \frac{2}{\lambda_1^{t}(1+2\lambda_1)+\lambda_2^{t}(1+2\lambda_2)}$

which is the first equation of (2:1.6) with t+1 substituted
for t. The recursion for the second equation of (2:1.6) is
obtained in a completely analogous fashion, by considering
the decision problem when both d.m.'s are in one of the
side-lanes (x(t+1)="s", say). This leads to the
minimization

$S(t+1) = \min_{u+h=1} \{ u^2 M(t) + h^2 S(t) \}$

The minimum is attained at

$u = \frac{S(t)}{M(t)+S(t)}$, $h = \frac{M(t)}{M(t)+S(t)}$

which shows the third (and, with a trivial substitution,
also the first) equation of (2:1.8), and further

$S(t+1) = \frac{S(t)M(t)}{M(t)+S(t)}$

Using the same substitutions as above we get

$S(t+1) = \frac{4}{2a+b} = \frac{4}{\lambda_1^{t-1}(3+7\lambda_1)+\lambda_2^{t-1}(3+7\lambda_2)}$

Using (2:1.7) one can easily verify that $3+7\lambda_1 = \lambda_1(1+3\lambda_1)$ and
that $3+7\lambda_2 = \lambda_2(1+3\lambda_2)$. Therefore

$S(t+1) = \frac{4}{\lambda_1^{t}(1+3\lambda_1)+\lambda_2^{t}(1+3\lambda_2)}$

which is the second equation of (2:1.6) with t+1 substituted
for t. To complete the induction argument we shall verify
(2:1.6) for t=0. Using the fact that $1/\lambda_1 + 1/\lambda_2 = -2$ and that
$\lambda_1 + \lambda_2 = 2$ we find

$M(0) = \frac{2}{\lambda_1^{-1}(1+2\lambda_1)+\lambda_2^{-1}(1+2\lambda_2)} = \frac{2}{-2+4} = 1$

$S(0) = \frac{4}{\lambda_1^{-1}(1+3\lambda_1)+\lambda_2^{-1}(1+3\lambda_2)} = \frac{4}{-2+6} = 1$

To prove (2:1.9) note that since $|\lambda_1| < 1$ and $\lambda_2 > 1$ we have

$\lim_{t \to \infty} \frac{S(t)}{M(t)} = \frac{4(1+2\lambda_2)}{2(1+3\lambda_2)} = \frac{6+4\sqrt{2}}{4+3\sqrt{2}} = \sqrt{2}$

Thus

$\frac{M(t)}{M(t)+S(t)} \to \frac{1}{1+\sqrt{2}}$, $\frac{M(t)}{2M(t)+S(t)} \to \frac{1}{2+\sqrt{2}}$, $\frac{S(t)}{M(t)+S(t)} \to \frac{\sqrt{2}}{1+\sqrt{2}}$

which gives immediately (2:1.9) from (2:1.8). We say that
the convergence is rapid, because numerical verification
shows that already for t=3 u, h and d are within 1% of their
limit-values.
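The numerical verification referred to here can be reproduced in a few lines. The Python sketch below is an illustration under stated assumptions: it uses the recursions for M(t) and S(t) from the proof with M(0)=S(0)=1, the limits $1/(2+\sqrt{2})$, $\sqrt{2}/(2+\sqrt{2})$, $\sqrt{2}/(1+\sqrt{2})$ and $1/(1+\sqrt{2})$ for the decision probabilities, and it reads the "within 1%" statement as a relative deviation. The function names are hypothetical.

```python
from math import sqrt

R2 = sqrt(2.0)

# Limiting decision probabilities (u, h, d); state "n" mirrors state "s".
LIMIT = {
    "m": (1 / (2 + R2), R2 / (2 + R2), 1 / (2 + R2)),
    "s": (R2 / (1 + R2), 1 / (1 + R2), 0.0),
}


def values(t_max):
    """[(M(t), S(t)) for t = 0..t_max] from the recursion in the proof."""
    m, s = 1.0, 1.0
    out = [(m, s)]
    for _ in range(t_max):
        m, s = s * m / (2 * m + s), s * m / (m + s)
        out.append((m, s))
    return out


def strategy(x, m_prev, s_prev):
    """Optimal (u, h, d) in state x ("m" or "s") given M(t-1) and S(t-1)."""
    if x == "m":
        z = 2 * m_prev + s_prev
        return (m_prev / z, s_prev / z, m_prev / z)
    z = m_prev + s_prev
    return (s_prev / z, m_prev / z, 0.0)


def max_relative_deviation(t):
    """Largest relative distance of the step-t strategy from its limit."""
    m_prev, s_prev = values(t - 1)[-1]
    worst = 0.0
    for x in ("m", "s"):
        for p, p_lim in zip(strategy(x, m_prev, s_prev), LIMIT[x]):
            if p_lim > 0:
                worst = max(worst, abs(p - p_lim) / p_lim)
    return worst


if __name__ == "__main__":
    for t in range(1, 7):
        print(t, f"{100 * max_relative_deviation(t):.3f}%")
```

Under this reading the deviation is still above 1% at t=2 and drops below 1% at t=3, where the middle-lane strategy is (5/17, 7/17, 5/17).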

(2:1.10)  Properties.  Symmetric  solutions  to  symmetric  team 
problems  generally  have  the  following  properties 

• realism 

• simplicity 

• fairness 

• robustness 

• no cost of convention

These five points are elaborated below.

• A simple simulation program that makes the d.m.'s move on
a video-screen according to the solution just derived shows
a very familiar scene. For example we may see a pattern as
in fig. 2:1.b. This suggests that the concept of symmetry in
team problems, as described in this section, is realistic in
the sense that it models how, in some team problems that
occur in everyday life, the asymmetry that is needed for a
solution is introduced through randomization. Here the
practical meaning of the mathematical concept of randomized
decisions is that the decisions depend on factors too
complex to be included in the model.

Figure 2:1.b. Sample pattern of the solution of the Corridor
Problem, resulting from a randomized symmetric
strategy-tuple.

• Decision problems with more than one decision maker
involved are, with our present knowledge about such
problems, often hard (if not impossible) to solve. This is
the case when the number of decision makers N=2, but even
more so for large scale problems where N>>1. When such
problems are symmetric team problems then the restriction to
symmetric strategy-tuples can introduce great
simplification. Under such restriction we need to find only
a single strategy that minimizes cost. Although the search
is among randomized strategies, it is a better understood
and usually simpler problem than finding a tuple of N
possibly different strategies. This point is illustrated in
section  (2:2)  where  we  consider  the  problem  of  access 
control  of  N stations  to  a common  communication  wire. 

• A solution to a symmetric team problem may require that
one (or a few) of the decision makers makes an exceptional
decision. Generally one would consider it fair if a
"lottery" determines which of the decision makers will be
the exception.

• When  the  team  members  (decision  makers)  adopt  randomized 
strategies  that  are  not  degenerate,  the  implication  is  that 
each  decision  maker  does  not  count  on  specific  strategies  of 
his  team  members,  but  is  prepared  to  cooperate  with  a whole 
range  of  possible  strategies  of  his  team  members.  As  such, 
the  symmetric  solution  is  robust  because  it  is  not  likely  to 
break  down  when  one  of  the  team  members  makes  an  error,  i.e. 
accidentally  does  not  follow  the  strategy  that  was  counted 
on. For example in the case of the Corridor Problem, when
one team member insists on keeping the left lane, a collision
will still most likely be avoided through the
randomizations  of  the  other  team  member  (assuming  T not  too 
small).  However,  suppose  we  also  allowed  asymmetric 
strategies  in  the  Corridor  Problem,  then  the  strategy-tuple 
where  both  team  members  keep  their  right  lane*)  achieves  the 
minimum  cost  but  it  would  not  have  the  flexibility  of 
avoiding  a collision  when  one  of  the  team  members  made  the 
error  of  insisting  on  keeping  the  left  lane. 

• For a given symmetric team problem one can always get at
least as low cost as the symmetric solution by allowing
asymmetric solutions. However it is not reflected in the
cost function that asymmetric solutions in fact also bear
the cost of establishing a convention that tells which
asymmetric solution is to be chosen. Taking the latter cost
into account one could find that sometimes the symmetric
solution is cheaper than the unrestricted solution.
Consider again the Corridor Problem. As long as we talk
about pedestrian traffic, the expected cost of collision is
so low that it is not worthwhile to have a widely
established convention. In the case of automobile traffic,
however, the cost of collision is so high that establishing
a convention certainly pays off.

*) Note that the strategy-tuple "both d.m.'s keep right" is
not a symmetric solution, because it requires a priori
agreement between the d.m.'s that the equally good
strategy-tuple "both d.m.'s keep left" will not be used (cf.
definition (2:1.1)).
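The simulation mentioned under the first property can be sketched without a video-screen. The Python code below is a hypothetical reconstruction, not the program referred to in the text: both d.m.'s start in the middle lane with T steps to go, draw their decisions independently from the symmetric strategy derived above (using the recursions for M(t) and S(t) with M(0)=S(0)=1), and the observed collision frequency is compared with M(T). All function names, the number of trials and the seed are assumptions.

```python
import random


def value_table(T):
    """Lists M[t], S[t] for t = 0..T from the recursion in the proof."""
    M, S = [1.0], [1.0]
    for _ in range(T):
        m, s = M[-1], S[-1]
        M.append(s * m / (2 * m + s))
        S.append(s * m / (m + s))
    return M, S


# Lane reached from a common lane when both d.m.'s take the same decision.
MOVE = {("n", "h"): "n", ("n", "d"): "m",
        ("m", "u"): "n", ("m", "h"): "m", ("m", "d"): "s",
        ("s", "u"): "m", ("s", "h"): "s"}


def strategy(x, t, M, S):
    """Decision probabilities in common lane x with t steps to go."""
    m, s = M[t - 1], S[t - 1]
    if x == "m":
        z = 2 * m + s
        return {"u": m / z, "h": s / z, "d": m / z}
    toward_middle = "d" if x == "n" else "u"
    return {"h": m / (m + s), toward_middle: s / (m + s)}


def collide_once(T, M, S, rng):
    """Simulate one walk; True if the d.m.'s share a lane when t = 0."""
    x = "m"
    for t in range(T, 0, -1):
        probs = strategy(x, t, M, S)
        decisions, weights = list(probs), list(probs.values())
        d1 = rng.choices(decisions, weights)[0]
        d2 = rng.choices(decisions, weights)[0]
        if d1 != d2:
            return False  # different lanes: both keep their lane, no collision
        x = MOVE[(x, d1)]
    return True


def collision_rate(T, trials=100_000, seed=0):
    """Return (simulated collision frequency, M(T))."""
    rng = random.Random(seed)
    M, S = value_table(T)
    hits = sum(collide_once(T, M, S, rng) for _ in range(trials))
    return hits / trials, M[T]


if __name__ == "__main__":
    for T in (1, 2, 3, 5):
        rate, exact = collision_rate(T)
        print(f"T={T}  simulated={rate:.4f}  M(T)={exact:.4f}")
```

With T=3 the simulated frequency should be close to M(3)=1/17, up to sampling noise.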

(2:1.11) Definition. A symmetric solution to a symmetric
team problem is said to be asymptotically optimal if the
difference in cost between the restricted (symmetric)
solution and the unrestricted (possibly asymmetric) solution
tends to 0 when the number of time steps in which the problem must
be solved $T \to \infty$ or the number of decision makers $N \to \infty$. In
the latter case it is assumed that the cost is defined for
all values of N.

(2:1.12)  Corollary.  The  solution  of  the  Corridor  Problem  is 
asymptotically  optimal. 

Proof. Obviously the asymmetric solution (u,h,d)=(1,0,0) for
d.m.1 and (u,h,d)=(0,0,1) for d.m.2 at t=T yields the
minimum cost possible: 0. The optimal symmetric solution
yields cost M(T). From (2:1.6-7) it is clear that
$\lim_{T \to \infty} M(T) = 0$. Q.E.D.
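The rate at which M(T) vanishes can be made explicit. As a sketch, assume the closed form for M(T) with exponent T-1 (the reading consistent with M(0)=1) and $\lambda_1 = 1-\sqrt{2}$, $\lambda_2 = 1+\sqrt{2}$. The term in $\lambda_1^{T-1}$ vanishes because $|\lambda_1| < 1$, and $1+2\lambda_2 = 3+2\sqrt{2} = \lambda_2^2$, so that

```latex
M(T) = \frac{2}{\lambda_1^{T-1}(1+2\lambda_1) + \lambda_2^{T-1}(1+2\lambda_2)}
\;\approx\; \frac{2}{\lambda_2^{T-1}\,\lambda_2^{2}}
= 2\,(1+\sqrt{2})^{-(T+1)}
\;\longrightarrow\; 0 \qquad (T \to \infty).
```

The collision probability of the symmetric solution therefore decays geometrically, by a factor of about $1/(1+\sqrt{2}) \approx 0.41$ for each additional step. For T=3 the approximation gives about 0.0589, against the exact value M(3)=1/17.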

(2:1.13) Remark. It would be valuable to be able to tell
directly  from  the  cost  function  of  a symmetric  team  problem 
whether  the  symmetric  solution  is  asymptotically  optimal  or 
not,  without  actually  finding  the  minimum  cost  of  the 
symmetric  and  the  unrestricted  solution.  The  next  theorem  is 
a first  attempt  in  this  direction.  It  gives  sufficient 
conditions,  in  terms  of  the  cost  function,  for  asymptotic 
optimality when the number of decision makers $N \to \infty$. The
key  point  in  the  proof  is  the  use  of  a limiting  result  (the 
strong  law  of  large  numbers).  The  sufficient  conditions  are 
stronger  than  necessary,  because  the  cost  for  the  access 
problem  in  the  next  section  does  not  satisfy  the  condition 
but  nevertheless  asymptotic  optimality  is  proved  in 
(2:2.19).  There  is  a strong  connection  between  the  next 
theorem  and  theorem  (2:2.19)  in  the  sense  that  the  proof  of 
(2:2.19)  also  relies  on  a limiting  result  (the  convergence 
of  the  hypergeometric  distribution  to  the  binomial 
distribution).  The  difference  is  that  in  (2:2.19)  we  do  not 
consider  the  number  of  decision  makers  that  choose  a certain 
strategy, but the number of decision makers that make a
certain  decision. 
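The limiting argument behind such a result can be illustrated with a toy computation. The Python sketch below is not taken from the report: it assumes a hypothetical cost with K=2 strategies that depends only on the fraction of decision makers choosing strategy 1, with its minimum at a fraction of 0.3. When each of N decision makers independently chooses strategy 1 with probability 0.3, the fraction concentrates around 0.3 as N grows, and the exact expected cost of this symmetric randomized strategy approaches the unrestricted minimum.

```python
from math import comb

TARGET = 0.3  # hypothetical: the cost is lowest when 30% choose strategy 1


def cost(n1, n2):
    """Hypothetical cost that depends only on the ratio n1 : n2."""
    return (n1 / (n1 + n2) - TARGET) ** 2


def expected_symmetric_cost(N, r1=TARGET):
    """Exact expected cost when each of N d.m.'s picks strategy 1 w.p. r1."""
    return sum(comb(N, k) * r1 ** k * (1 - r1) ** (N - k) * cost(k, N - k)
               for k in range(N + 1))


def best_unrestricted_cost(N):
    """Minimum cost over all splits (n1, n2) with n1 + n2 = N."""
    return min(cost(k, N - k) for k in range(N + 1))


if __name__ == "__main__":
    for N in (10, 100, 1000):
        gap = expected_symmetric_cost(N) - best_unrestricted_cost(N)
        print(f"N={N:5d}  gap between symmetric and unrestricted cost = {gap:.6f}")
```

For this particular cost the expected value equals the variance of the fraction, 0.21/N, so the gap shrinks in proportion to 1/N.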
(2:1.14) Theorem. The cost of a given symmetric team problem
can always be written as C(n), where the value of the j-th
component of the vector n gives the number of decision
makers that chose strategy j, j=1,...,K (K = number of
alternative strategies for each decision maker). Suppose
that the symmetric team problem is defined for any total
number of decision makers N and that

$C(n) = C(\lambda n)$

for any n and $\lambda$ such that n and $\lambda n$ are (non-negative)
integer vectors, i.e. the cost depends only on the ratio of
numbers of decision makers choosing the different
strategies. Suppose further that the obvious extension of
C(*) to the space of vectors with non-negative rational
components, which is defined by C(q)=C(mq) with $q_j = w_j/v_j$
(= ratio of non-negative integers) and $m = \prod_{j=1}^{K} v_j$, is
continuous. Then the solution to the symmetric team problem
is asymptotically optimal for $N \to \infty$.


Proof. The cost tensor which gives the cost for any
strategy-tuple is by definition invariant under a
permutation of indices. Therefore the cost depends not on
who the decision makers are that choose the particular
strategies but only on the number of decision makers that
choose each strategy, which implies that the cost can be
written in the form C(n). Since C(·) is continuous on the
space of positive rational vectors it can easily be extended
to a continuous function on the space of non-negative real
vectors by $C(r) = \lim_{l\to\infty} C(q(l))$ where $\{q(l)\}_{l=1}^{\infty}$ is a
sequence of non-negative rational vectors that converges to r.
Because

$$R_1 = \{\, r \mid \textstyle\sum_{j=1}^{K} r_j = 1,\ r_j \ge 0 \,\}$$

is a compact set and because $C(r) = C(\lambda r)$, there exists an
$\hat r \in R_1$ such that for any non-negative vector r:
$C(r) \ge C(\hat r)$. Now consider the randomized symmetric strategy
where each decision maker chooses with probability $\hat r_j$ the
j-th strategy out of the finite set of possible strategies.
Let the random vector $m_i$ have all elements = 0 except for
the j-th element which is = 1; j is determined by the
randomization of decision maker i as the index of the
strategy he chooses. Then $E\{m_i\} = \hat r$. According to the
definition of n above we have $n = \sum_{i=1}^{N} m_i$ and the expected
cost is $E\{C(n)\} = E\{C((1/N)\,n)\}$. From Kolmogorov's strong
law of large numbers (cf. [7], p.124) we know that for any
$\epsilon > 0$ there is an $N_\epsilon$ such that for $N > N_\epsilon$

$$\Pr\{\, |(1/N)\,n - \hat r| < \epsilon \,\} > 1 - \epsilon .$$

Together with the continuity of C(·) this implies that for N
large enough, the expected cost of this randomized
strategy-tuple will be arbitrarily close to $C(\hat r)$. Of course
for every value of N, the expected cost of the optimal
symmetric strategy-tuple and the cost of the optimal
unrestricted strategy-tuple will lie between the expected
cost of the randomized strategy-tuple under consideration
and $C(\hat r)$. It follows that the symmetric solution must be
asymptotically optimal.
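
The mechanism of the proof can be illustrated with a small sketch. The cost used here is hypothetical (it is not taken from this report): K=2 strategies and C(n) = (n_1/(n_1+n_2) - 1/2)^2, which depends only on the ratio of the components and is continuous, so it satisfies the conditions of the theorem with minimizer (1/2, 1/2) and minimum value 0. Under the randomized symmetric strategy n_1 is binomially distributed, so the expected cost can be evaluated exactly; for this particular cost it equals the variance of n_1/N, i.e. 1/(4N), which tends to the minimum as N grows.

```python
from math import comb


def cost(n1, n2):
    # Hypothetical cost, assumed only for this illustration: it depends on
    # the ratio n1 : n2 alone and is minimal (zero) when the split is even.
    return (n1 / (n1 + n2) - 0.5) ** 2


def expected_cost_randomized(N, p=0.5):
    """Exact expected cost when each of N decision makers independently
    picks strategy 1 with probability p (randomized symmetric strategy)."""
    total = 0.0
    for n1 in range(N + 1):
        prob = comb(N, n1) * p**n1 * (1 - p) ** (N - n1)
        total += prob * cost(n1, N - n1)
    return total


if __name__ == "__main__":
    for N in (2, 10, 100, 500):
        print(N, expected_cost_randomized(N))
```

The printed values shrink toward 0, the cost of the best unrestricted assignment for even N, which is the behaviour the theorem asserts.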


2:2. Access Problem in Wire Communication.

The  concept  of  many  devices  sharing  one 
communication  wire  (data-bus)  is  already  well  established  in 
computer  systems.  With  micro-processors  becoming  widely 
available  at  very  low  cost,  the  usefulness  of  this  concept 
is  greatly  enhanced  and  it  finds  ever  increasing 
application.  The  devices  that  share  the  communication  wire 
are  not  limited  anymore  to  the  traditional  elements  of  a 
computer  system.  For  example  we  may  now  find  systems  that 
consist  of  a large  number  of  measurement  devices  which 
periodically  send  data  on  water-  and  air  quality  to  data 
processing  units  and  which  periodically  receive  instructions 
on  what  and  how  to  measure  over  the  same  communication  wire. 
Or,  we  may  find  a system  consisting  of  sensors,  actuators 
and  minicomputers  that  control  some  production  process  and 
share  one  communication  wire.  Or  we  may  find  a set  of 
computer  systems  that  are  connected  by  a high  speed  data-bus 
to  provide  the  possibility  of  distributed  computing.  These 
examples  form  a tiny  fraction  of  all  possible  applications 
of multi-access wire communication. The major advantages of
one shared wire over, say, a network of connections between
every pair of devices which may need to communicate with
each other are efficiency and simplicity. Sharing is
efficient, because the devices with bursty communication
traffic can now alternate in the use of the communication
wire. Simplicity and flexibility are inherent to the
concept  of  one  wire  — one  can  add  and  take  away  devices  or 
establish  and  eliminate  logical  communication  links  between 
pairs  of  devices  without  altering  the  simple  topology  of  the 
system. 

The purpose of this section is to show how the
framework of non-classical control theory, in particular the
new concept of symmetric team problems, gives insight into the
different possible solutions to the access problem in
multi-access wire communication. Loosely speaking, the
access problem is the problem of determining efficiently,
and in a decentralized way, which device will be next to put
a signal on the communication wire after the wire becomes
available again. It is not the purpose of this section to
give a complete solution to all control problems in
multi-access wire communication. In the description below, a
few control aspects of multi-access wire communication are
mentioned, and it is indicated why the access problem is
the main control problem.

(2:2.1) Description. A device that is attached to the
communication  wire  and  which  has  a decision  making  process, 
which  controls  when  the  device  will  transmit  on  the  wire, 
will  be  called  a station.  The  stations  can  transmit 
information  through  the  wire  in  the  form  of  packets.  A 


packet  is  a string  of  bits  that  is  composed  of  a header 
which  contains  the  destination  address  of  the  packet  and 
possibly  other  control  information,  a body  which  contains 
arbitrary  binary  information,  and  a tail  which  contains  a 
checksum  to  verify  correct  transmission.  Packets  can  have 
any  length,  but  it  is  assumed  that  their  length  in  bits 
divided  by  the  bit-rate  of  the  wire  is  much  larger  than  the 
propagation delay between the far ends of the wire. We say
that  a station  needs  access  to  the  wire  if  there  are  packets 
queued  at  that  station  to  be  transmitted  on  the  wire.  If  a 
station  does  not  need  access  then  that  means  that  the  queue 
is  empty,  and  the  queue  is  assumed  to  be  of  infinite 
capacity.  The  controls  that  each  station  has  are  like  a 
traffic  light  (without  yellow).  A station  transmits  a packet 
on  the  wire  (i.e.  attempts  to  access  the  wire)  if  it  has 
green  light  and  a packet  in  its  queue,  otherwise  the  station 
is silent. The history of the wire consists of alternating
intervals of contention (different stations attempt to
access the wire), acquisition (one station has sole access
to the wire) and silence (no station needs access). We
shall assume that eventually every packet that becomes
queued at a station will be successfully transmitted. The
objective is to minimize the total delay. Consider a period
that starts at the end of an interval of silence and that
ends at the start of an interval of silence. The total delay
in that period would certainly be minimized if this period
were just a succession of intervals of acquisition. In that
case the total delay in that period corresponds to the
shaded area in the graph given in fig. 2:2.a.

Figure 2:2.a. The shaded area gives a lower bound for the
total delay incurred in a period between two intervals of
silence. The area represents the total number of bit-seconds
waiting for this particular sample of arrivals, when the
wire is used at 100% of its capacity during this period.

However, given the fact that the packets to be sent are
distributed over different stations and that we must
determine in a decentralized way which station succeeds
another station in acquisition of the wire, there will be
intervals of contention during which no useful bits are
transmitted, thereby increasing the total delay. See fig.
2:2.b (for the sake of exposition it is assumed that the
period under consideration does not run into the next
period).

Figure 2:2.b. The actual delay will be larger than the lower
bound given in fig. 2:2.a, because in practice there will be
intervals of contention during which it will be determined
which station is next in acquisition of the wire and no
useful bits are transmitted then.

The actual delay can be brought as close as possible to its
lower bound firstly by minimizing the number of intervals of
contention in a given period and secondly by minimizing the
length of each such interval. To achieve the first point one
obvious control is that each station will not transmit a
packet on the wire as long as it "hears" that another
station has not finished its transmission. Apart from this
control, which is known as "carrier sense" in packet radio
applications [8], the number of intervals is predominantly
determined by how many stations need access, which is mostly
determined by outside events. So the main control problem
relates to the second objective: minimize the length of a
contention interval; the rest of this section deals with
this access problem.

The access problem can arise in two different ways. One: if
during an interval of acquisition by one station at least
two other stations get a packet in their queue, then it is
assumed that right after the present station gives up its
acquisition all the stations which need access to the wire
will start transmitting and will "hear" within one
end-to-end delay time that there is a collision of packets.
Two: if during an interval of silence two or more stations
get a packet within one end-to-end delay time, in the same
way these stations will transmit their packets concurrently
and detect that there is a collision of packets. In both
cases the stations involved are approximately synchronized
(to within one end-to-end delay time). In the subsequent
analysis we shall assume that the stations are perfectly
synchronized. This assumption is justified if the time
interval between subsequent decisions of a station (= one
time slot) is constant and not less than three end-to-end
delay times and if the stations transmit for exactly the
first two-thirds of a time slot when there is a time slot of
interference.

At  this  point  we  have  the  access  problem 
sufficiently  introduced  and  isolated  to  be  able  to  represent 
it  by  a simple  mathematical  model. 

(2:2.2) Model. The access problem arises whenever in some
time slot, which we shall label as time slot 0, no station
has acquisition of the wire but $n_0 \ge 2$ stations need access to
the wire out of a total population of $N_0$ stations. The $n_0$
stations will generate a collision in that time slot,
thereby making the initial conditions known that there are
$N_0$ stations out of which $n_0$ need access to the wire. The
variables $x_i$ describe which are those $n_0$:

    x_i = 1  if station i needs access
        = 0  if station i doesn't need access

and any subset of $n_0$ out of $N_0$ with $n_0 \ge 2$ is equally likely.
Each station has to make at times t=1,2,... (where time t
marks the beginning of time slot t) a binary decision $u_i(t)$:

    u_i(t) = 1  station i has green light during time slot t
                (has no effect when x_i = 0)
           = 0  station i has not green light during time slot t

The state of the wire is represented by the variable r(t),
which has as its value the number of stations that attempted
to access the wire in time slot t:

$$r(t) = \sum_{i=1}^{N_0} x_i\,u_i(t)$$

(2:2.3) Problem. The problem is to find controls
(= decisions) that minimize T, the duration of the interval
of contention. T is given by:

$$T = \min\{\, t \mid u_i(t+1)\,x_i = 1 \text{ for exactly one } i \in \{1,\dots,N_0\} \,\}.$$

However the solution to this problem depends on what the
information is on which the controls will be based. This
will be specified in paragraph (2:2.5).
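
Model (2:2.2) and the definition of T in (2:2.3) can be made concrete with a minimal sketch. The function names and the small example (four stations, a fixed two-slot schedule) are assumptions made for illustration only; the sketch evaluates r(t) and T for given x and given decisions, and does not choose the decisions.

```python
def wire_state(x, u_t):
    """r(t) = sum_i x_i * u_i(t): number of stations that actually transmit."""
    return sum(xi * ui for xi, ui in zip(x, u_t))


def contention_length(x, schedule):
    """T = min{t : exactly one station transmits in time slot t+1}.

    schedule[k] holds the decisions u_i(k+1) for time slot k+1.
    Returns None if no slot in the given schedule is successful.
    """
    for t, u_t in enumerate(schedule):  # index t corresponds to slot t+1
        if wire_state(x, u_t) == 1:
            return t
    return None


if __name__ == "__main__":
    x = [1, 0, 1, 1]          # stations 1, 3 and 4 need access (n_0 = 3, N_0 = 4)
    schedule = [
        [1, 1, 1, 0],         # slot 1: stations 1 and 3 transmit, a collision
        [1, 1, 0, 0],         # slot 2: only station 1 transmits
    ]
    print([wire_state(x, u_t) for u_t in schedule])
    print(contention_length(x, schedule))
```

In this example one time slot is lost to contention before a single station acquires the wire, so T = 1; had the first slot been successful, T would be 0.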

(2:2.4) Definition. A control law is defined as a map from
the information of a station to the controls by that
station.

(2:2.5) Model. (Information structure). The nature of the
problem is such that the decisions have to be made in a
decentralized way. Thus there is not a central controller
who knows the vector x, in which case the minimum cost would
obviously be 0, because the central controller could just
pick one of the $n_0$ stations with $x_i=1$. Decentralization
implies that the only information on the initial conditions
available to station i is the value of $x_i$ (and implicitly $N_0$
and $n_0$). However if $x_i=0$ then the cost is independent of
whatever controls are chosen. Therefore we can take the
controls in the case $x_i=0$ the same as in the case $x_i=1$,
which leads to the important conclusion that the control law
of station i will not be a function of $x_i$. With regard to
further possible information we shall consider three
different cases of information structures and thus create
three different cases of problem (2:2.3) which we want to
compare. The first case (case (u)) has the richest
information structure and the corresponding solution is an
unrestricted solution in the sense of section (2:1) in
comparison with the second case. The second case (case (s))
differs from the first case only in the restriction that the
corresponding solution must be a symmetric solution (cf.
definition (2:1.1)). In terms of the information structure
this means the control law can not be a function of the
number assigned to a station, assuming we have numbered our
stations. In both cases, as the system evolves in
time, each station will record the state history of the
wire, on which future controls can be based; more
specifically it is assumed that at time t each station knows
of previous time slots how many of the $n_0$ stations that need
access to the wire attempted to do so and whether it was one
of those (i.e. the station also knows its control history).

In the third case (case (m)) only the minimum information
which is necessary to implement a control law is available.
This is only one bit of information which indicates whether
one station acquired the wire (i.e. the problem is over) or
not (i.e. we still have the initial problem).

In summary we have in each case at time t the following
information on which the controls can be based:

case (u)  i : the number assigned to the station
          r(t-1), r(t-2), ..., r(1) : history of the number of
              stations with x_i=1 that attempted to access the wire
          u_i(t-1), u_i(t-2), ..., u_i(1) : the station's control
              history

case (s)  r(t-1), r(t-2), ..., r(1) : history of the number of
              stations with x_i=1 that attempted to access the wire
          u_i(t-1), u_i(t-2), ..., u_i(1) : the station's control
              history

case (m)  r(t-1)=1 (true or false) : whether one station
              acquired the wire

In both cases the initial information $N_0$, $n_0$ is implicitly
assumed to be available. Note that if we consider the
control laws which are in effect at the different stations
to be available information, then in case (u) the control
history becomes redundant information because it can always
be reconstructed. In case (s), as we shall see in solution
(2:2.10), the controls will be made dependent on random
events (randomized decisions); in that case the control
history also becomes redundant information when each station
records the outcomes of its randomizer.

(2:2.6) Remark. It is somewhat unusual to assume the
information r(t) to be available; customarily one would
assume only a reduction R of r(t) to be available, given by
R(0)="silence", R(1)="successful access" and
R(r)="collision" for $r \ge 2$. The reason for our assumption is
that it is easier to deal with a value r(t) than with a
probability distribution over the possible values $1,\dots,n_0$,
which would otherwise be the case. Further in defense of
this assumption we want to mention that it is not
inconceivable that each station has a hardware device which
can measure r(t) from the intensity of the signal on the
wire. In the same way we can defend the assumption that a
station knows $n_0$, the number of stations that need access to
the wire initially. That is, the access problem will
presumably arise only right after a collision has occurred
among the stations that need access to the wire. From this
initial collision the stations will know the value of $n_0$.


(2:2.7) Solution. (Case (u)). Given the minor restriction
that at each point in time the set of stations that are
considered for possibly having green light form a subset of
the original $N_0$ stations out of which a known number
need access, then the minimum expected length of the
interval of contention is

$$T(N_0, n_0)$$

with T(·,·) given by the recursive relation:

(2:2.8)
$$T(N,n) = \min_{u} \Big\{ \sum_{r=0}^{n} H(r \mid N,n,u)\,\big( L(r) + \min\{\, T(u,r),\ T(N-u,n-r) \,\} \big) \Big\}$$

where

$$L(r) = \begin{cases} 0 & \text{if } r = 1 \\ 1 & \text{otherwise} \end{cases}$$

and

$$H(r \mid N,n,u) = \frac{\binom{n}{r}\binom{N-n}{u-r}}{\binom{N}{u}} \qquad \text{(Hypergeometric distribution)}$$

The "boundary conditions" are given by

$$T(N,0) = \infty \quad \forall N, \qquad T(N,1) = 0 \quad \forall N.$$

The optimal controls are also given by (2:2.8): if at the
beginning of time slot t there are N stations considered out
of which n need access, then u, the argument of the
minimization in (2:2.8), gives the optimal number of stations
in the set under consideration to have green light for time
slot t. Let the number of stations that transmit a packet in
time slot t be r; then the stations to be considered for
green light in the next time slot are the u stations that
just had green light (out of which r need access) if
$T(u,r) \le T(N-u,n-r)$; otherwise it is the set of N-u stations
that did not have green light (out of which n-r need
access).

Proof. Suppose that the minimum expected cost for problem
(2:2.3) with initial conditions N, n is given by T(N,n).
From the point of view of minimizing cost it is only the
total number of stations with green light (let us denote this
total by u) that matters. The latter is explained by the
fact that the control laws will not be functions of the
values $x_i$ and that further the only difference between the
stations is the number that is assigned to them. This
difference is useful, even necessary, to design a set of
control laws (one for each station) which generate controls
such that the total number of stations with green light is
u, but the effect of the controls is independent of whatever
numbers have been assigned to the stations with green light.
The result of having a total of u stations with green light
is that (by definition) only the r stations among them that
require access will put a signal on the wire. The probability
distribution of r is a hypergeometric distribution with
parameters N, n and u, because any combination of which n
out of N need access is equally likely *). Given u and r we
can distinguish two sets of stations: one set has u members
out of which r need access to the wire, the other set has
N-u members out of which n-r need access to the wire. For
each set we face the same problem as the initial problem,
with minimum expected costs T(u,r) and T(N-u,n-r),
respectively. Given the restriction we shall proceed with
that set which gives the lowest minimum expected cost, and
this explains the inside minimization in (2:2.8). We have
to add a term L(r) because whenever the attempt fails ($r \ne 1$),
one time slot is "wasted". Finally, in (2:2.8) the expected
value is obtained by a weighted summation, and the
minimization over u gives the minimum expected cost and
indicates the optimal control. To initialize the recursion
note that for n=1 and n=0 the minimum expected time before
one station acquires the wire is obvious.
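
Recursion (2:2.8) is straightforward to evaluate by memoized recursion. The sketch below is one possible implementation, not the program used to produce the tables in this report. It makes three assumptions that the text does not spell out: u is restricted to 1,...,N-1 (u=0 or u=N would not split the set), the sum over r runs only over values with non-zero hypergeometric probability, and ties between equally good values of u are broken in favour of the smallest u.

```python
from functools import lru_cache
from math import comb, inf


def hypergeom(r, N, n, u):
    """H(r | N, n, u): probability that r of the u green-light stations
    need access, when n of the N stations need access."""
    return comb(n, r) * comb(N - n, u - r) / comb(N, u)


@lru_cache(maxsize=None)
def T_unrestricted(N, n):
    """Return (minimum expected cost T(N, n), a minimizing u) per (2:2.8)."""
    if n == 0:
        return inf, None          # boundary condition T(N, 0) = infinity
    if n == 1:
        return 0.0, None          # boundary condition T(N, 1) = 0
    best_cost, best_u = inf, None
    for u in range(1, N):         # assumption: only proper splits of the set
        expected = 0.0
        lo, hi = max(0, u - (N - n)), min(u, n)   # support of r
        for r in range(lo, hi + 1):
            wasted = 0 if r == 1 else 1           # L(r)
            cont = min(T_unrestricted(u, r)[0],
                       T_unrestricted(N - u, n - r)[0])
            expected += hypergeom(r, N, n, u) * (wasted + cont)
        if expected < best_cost - 1e-12:          # ties: keep the smallest u
            best_cost, best_u = expected, u
    return best_cost, best_u


if __name__ == "__main__":
    for N, n in [(3, 2), (4, 2), (5, 2), (5, 3), (5, 4)]:
        print(N, n, T_unrestricted(N, n))
```

Worked by hand, the recursion gives T(3,2) = 1/3, T(5,2) = 1/2 and T(5,3) = 2/5, in agreement with the entries 0.3, 0.5 and 0.4 printed for those cases in table (2:2.9). Because (2:2.8) is symmetric under u -> N-u, the minimizing u is often not unique, so the u returned here need not coincide with the tabulated one.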

*) Remember the probability-theorist's vase in which there
are N beads, n red ones and the other N-n are white. Out of
the vase we take (blindfolded) u beads. Then the number of
red beads in our drawing is modeled by a random variable
that has a hypergeometric distribution with parameters N, n
and u.

(2:2.9) Table. The following table gives for entries N,n the
optimal number of stations u to have green light (i.e. the
argument of the minimization over u in (2:2.8)) and the
minimum expected cost. The elements in the table have the
format  u; T(N,n).

  N   n=2      n=3      n=4      n=5      n=6      n=7      n=8      n=9
  2   1;0.0
  3   1;0.3    1;0.0
  4   2;0.3    1;0.3    1;0.0
  5   2;0.5    2;0.4    1;0.2    1;0.0
  6   3;0.5    2;0.4    1;0.4    1;0.2    1;0.0
  7   3;0.6    3;0.5    2;0.5    1;0.3    1;0.1    1;0.0
  8   4;0.6    3;0.5    2;0.5    2;0.5    1;0.3    1;0.1    1;0.0
  9   4;0.6    3;0.6    2;0.6    2;0.5    1;0.4    1;0.3    1;0.1    1;0.0
 10   5;0.7    4;0.6    2;0.6    2;0.6    2;0.5    1;0.4    1;0.2    1;0.1
 11   5;0.7    4;0.6    3;0.7    2;0.6    2;0.5    1;0.5    1;0.3    1;0.2
 12   6;0.7    5;0.6    3;0.7    2;0.6    2;0.6    2;0.5    1;0.4    1;0.3
 13   6;0.7    5;0.6    4;0.7    2;0.7    2;0.6    2;0.6    2;0.5    1;0.4
 14   7;0.7    6;0.6    4;0.7    3;0.7    2;0.6    2;0.6    2;0.5    1;0.5
 15   8;0.7    6;0.7    4;0.8    3;0.7    2;0.7    2;0.6    2;0.6    2;0.5
 16   8;0.7    6;0.7    4;0.8    4;0.8    3;0.7    2;0.6    2;0.6    2;0.6
 17   9;0.8    7;0.7    4;0.8    4;0.8    3;0.8    2;0.7    2;0.6    2;0.6
 18   9;0.8    7;0.7    4;0.8    4;0.8    3;0.8    2;0.7    2;0.7    2;0.6
 19   10;0.8   8;0.7    5;0.8    4;0.8    3;0.8    3;0.8    2;0.7    2;0.6
 20   10;0.8   8;0.7    5;0.9    4;0.8    4;0.8    3;0.8    2;0.7    2;0.7
 21   10;0.8   8;0.7    6;0.9    4;0.8    4;0.8    3;0.8    2;0.8    2;0.7
 22   11;0.8   9;0.7    6;0.9    4;0.9    4;0.8    3;0.8    3;0.8    2;0.7
 23   12;0.8   9;0.7    6;0.9    4;0.9    4;0.8    4;0.8    3;0.8    2;0.7
 24   12;0.8   10;0.7   7;0.9    5;0.9    4;0.9    4;0.8    3;0.8    3;0.8
 25   12;0.8   10;0.7   7;0.9    5;0.9    4;0.9    4;0.8    3;0.8    3;0.8
 26   13;0.8   10;0.7   8;0.9    6;0.9    4;0.9    4;0.8    4;0.8    3;0.8
 27   14;0.8   11;0.7   8;0.9    6;0.9    4;0.9    4;0.9    4;0.8    3;0.8
 28   14;0.8   11;0.7   8;0.9    6;0.9    4;0.9    4;0.9    4;0.8    3;0.8
 29   14;0.8   12;0.7   8;0.9    6;0.9    5;0.9    4;0.9    4;0.9    4;0.8
 30   15;0.8   12;0.7   8;0.9    6;1.0    5;0.9    4;0.9    4;0.9    4;0.8
 31   15;0.8   13;0.7   8;0.9    7;1.0    6;0.9    4;0.9    4;0.9    4;0.9
 32   16;0.8   13;0.7   8;0.9    7;1.0    6;0.9    4;0.9    4;0.9    4;0.9
 33   16;0.8   13;0.7   9;0.9    7;1.0    6;1.0    5;0.9    4;0.9    4;0.9
 34   17;0.9   14;0.7   9;0.9    8;1.0    6;1.0    5;0.9    4;0.9    4;0.9
 35   17;0.9   14;0.7   10;1.0   8;1.0    6;1.0    5;1.0    4;0.9    4;0.9
 36   18;0.9   15;0.7   10;1.0   8;1.0    6;1.0    6;1.0    4;0.9    4;0.9
 37   19;0.9   15;0.7   10;1.0   8;1.0    7;1.0    6;1.0    4;0.9    4;0.9
 38   19;0.9   15;0.7   11;1.0   8;1.0    7;1.0    6;1.0    5;0.9    4;0.9
 39   19;0.9   16;0.7   11;1.0   8;1.0    7;1.0    6;1.0    5;1.0    4;0.9
 40   20;0.9   16;0.7   12;1.0   8;1.0    8;1.0    6;1.0    5;1.0    4;0.9
 41   20;0.9   17;0.7   12;1.0   8;1.0    8;1.0    6;1.0    6;1.0    4;0.9
 42   21;0.9   17;0.7   12;1.0   8;1.0    8;1.0    6;1.0    6;1.0    5;1.0
 43   21;0.9   18;0.7   12;1.0   9;1.0    8;1.0    7;1.0    6;1.0    5;1.0
 44   22;0.9   18;0.7   13;1.0   9;1.0    8;1.0    7;1.0    6;1.0    5;1.0
 45   22;0.9   18;0.7   13;1.0   10;1.0   8;1.0    7;1.0    6;1.0    5;1.0
 46   23;0.9   19;0.7   14;1.0   10;1.0   8;1.0    7;1.0    6;1.0    6;1.0
 47   24;0.9   19;0.7   14;1.0   10;1.0   8;1.0    8;1.0    6;1.0    6;1.0
 48   24;0.9   20;0.7   14;1.0   10;1.0   8;1.0    8;1.0    6;1.0    6;1.0
 49   25;0.9   20;0.7   14;1.0   11;1.0   8;1.0    8;1.0    7;1.0    6;1.0
 50   25;0.9   20;0.7   15;1.0   11;1.0   8;1.0    8;1.0    7;1.0    6;1.0

(2:2.10) Solution. (Case (s)). Given the same minor
restriction as in solution (2:2.7), and in addition the
restriction that, according to the specification of case
(s), the solution must be symmetric, then the minimum
expected length of the interval of contention is

$$T(n_0)$$

with T(·) given by the recursive relation:

(2:2.11)
$$T(n) = \min_{v} \left\{ \frac{ B(0 \mid n,v) + B(n \mid n,v) + \sum_{r=2}^{n-1} B(r \mid n,v)\,\big( 1 + \min\{\, T(r),\ T(n-r) \,\} \big) }{ 1 - B(0 \mid n,v) - B(n \mid n,v) } \right\}$$

where

$$B(r \mid n,v) = \binom{n}{r}\, v^{r} (1-v)^{n-r} \qquad \text{(Binomial distribution)}$$

The "boundary conditions" are given by

$$T(0) = \infty, \qquad T(1) = 0$$

and the optimal controls are also given by (2:2.11) in the
following way: if at the beginning of time slot t there are
N stations considered for possibly having green light out of
which n need access, then v, the argument of the
minimization in (2:2.11), gives the optimal probability with
which the stations in the set under consideration are to
have green light for time slot t. Let the number of stations
that transmit a packet in time slot t be r. Then the set of
stations to be considered for green light in the next time
slot are the stations that just had green light (out of
which r need access) if $T(r) \le T(n-r)$; otherwise it is the
subset of the stations that just were under consideration
but that did not have green light (out of which n-r need
access).

Proof. Assume that (2:2.11) is already proven to be correct
when the argument of T(·) takes the values 2,...,n-1.
In the symmetric case it is necessary to consider randomized
decisions (cf. theorem (2:1.2)). For each time slot a binary
decision has to be made by the stations under consideration,
therefore we have to specify for each time slot the
probability of success for the Bernoulli trials that will
determine at each station in the set under consideration
whether that station will have green light.
Let the probability of success of the Bernoulli trials be v.
Then the probability distribution of r, the number of stations
with $x_i=1$ within the set under consideration that had green
light, depends only on n, the number of stations within the
set under consideration that initially needed access (only
those stations can contribute to r), and has the binomial
distribution with parameters n and v. After a given outcome
r of the random variable r there are two sets of stations:
one set of stations that just had green light out of which r
need access and another set of stations that did not have
green light out of which n-r need access. Given the minor
restriction we shall proceed optimally with that set with
the smaller expected time before one station acquires the
wire. Let us denote the expected time before one station
acquires the wire with this strategy by $T_v(n)$. With
probabilities $B(0 \mid n,v)$ and $B(n \mid n,v)$ r will take the values 0
and n respectively; in both cases we spend 1 time slot and

(2:2.13) Table. The following table gives for entries n the
optimal probability with which the stations in the set under
consideration should have green light (i.e. the argument of
the minimization over v in (2:2.11)) and the minimum
expected cost. The elements in the table have the format
v; T(n).

        n=2    n=3    n=4    n=5    n=6    n=7    n=8    n=9
 v      0.50   0.41   0.31   0.24   0.19   0.17   0.15   0.13
 T(n)   1.0    0.8    1.1    1.2    1.3    1.3    1.3    1.3
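
Recursion (2:2.11) can be evaluated numerically in the same way as (2:2.8). The sketch below is an illustration under stated assumptions and is not the procedure used for table (2:2.13): the minimization over v is done by a plain search over the grid v = 1/1000, ..., 999/1000 (the grid size is an arbitrary choice), so the returned probability is accurate only to the grid spacing.

```python
from functools import lru_cache
from math import comb, inf


def binom_pmf(r, n, v):
    """B(r | n, v): binomial probability of r successes in n trials."""
    return comb(n, r) * v**r * (1 - v) ** (n - r)


@lru_cache(maxsize=None)
def T_symmetric(n, grid=1000):
    """Return (minimum expected cost T(n), minimizing v on the grid)."""
    if n == 0:
        return inf, None          # boundary condition T(0) = infinity
    if n == 1:
        return 0.0, None          # boundary condition T(1) = 0
    # Continuation costs min{T(r), T(n-r)} for r = 2, ..., n-1.
    cont = {r: min(T_symmetric(r, grid)[0], T_symmetric(n - r, grid)[0])
            for r in range(2, n)}
    best_cost, best_v = inf, None
    for k in range(1, grid):      # assumption: grid search over v in (0, 1)
        v = k / grid
        b0, bn = binom_pmf(0, n, v), binom_pmf(n, n, v)
        numer = b0 + bn + sum(binom_pmf(r, n, v) * (1 + cont[r])
                              for r in range(2, n))
        cost = numer / (1 - b0 - bn)
        if cost < best_cost:
            best_cost, best_v = cost, v
    return best_cost, best_v


if __name__ == "__main__":
    for n in range(2, 10):
        print(n, T_symmetric(n))
```

For n=2 the ratio in (2:2.11) reduces to ((1-v)^2 + v^2) / (2v(1-v)), which is minimized at v = 0.5 with value 1, matching the first column of table (2:2.13).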


It can be seen from tables (2:2.9) and (2:2.13)
that T(N,n) approaches T(n) as N gets larger. Indeed we have
here what was defined in section (2:1) as asymptotic
optimality of the symmetric solution. This will be stated
in theorem (2:2.19). The proof of the theorem is based on
the two lemmas given below and the fact that the
hypergeometric distribution approaches the binomial
distribution as the population size $N \to \infty$.
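The convergence of the hypergeometric to the binomial distribution can be illustrated numerically. The following is only a sketch: it assumes that H(r|N,n,u) denotes the probability that r of the n stations needing access are among the u stations (out of N) with green light, and that the matching binomial has parameters n and u/N. The function names and the choice u = N/2 are illustrative, not taken from the report.

```python
from math import comb

def hypergeom_pmf(r, N, n, u):
    """Assumed meaning of H(r|N,n,u): r of the n 'needy' stations fall among the u green ones."""
    if r < 0 or r > n or u - r < 0 or u - r > N - n:
        return 0.0
    return comb(n, r) * comb(N - n, u - r) / comb(N, u)

def binom_pmf(r, n, v):
    """B(r|n,v): binomial probability of r successes in n trials."""
    return comb(n, r) * v**r * (1 - v)**(n - r)

def max_gap(N, n, frac=0.5):
    """Largest pointwise difference between H(.|N,n,u) and B(.|n,u/N) for u = frac*N."""
    u = round(frac * N)
    return max(abs(hypergeom_pmf(r, N, n, u) - binom_pmf(r, n, u / N))
               for r in range(n + 1))

# The gap shrinks as the population size N grows (n fixed at 3 here).
gaps = {N: max_gap(N, n=3) for N in (10, 100, 1000)}
for N, g in gaps.items():
    print(N, g)
```

For fixed n the printed gap should decrease as N grows, which is the property the proof of theorem (2:2.19) relies on.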

(2:2.14) Lemma. Let T(N,n) be the minimum expected time
needed before one station has sole access when n out of N
need access, as given by (2:2.8). Then for fixed n, T(N,n)
is non-decreasing in N, or more precisely

$$T(N+1,n) \ge T(N,n) \quad \forall N \ge n$$

and the limit for $N \to \infty$ exists, i.e. there is a finite
number $T_n$ such that

$$T_n = \lim_{N \to \infty} T(N,n)$$

Proof. To show that T(·,n) is non-decreasing consider the
situation where there are N stations out of which n need
access. Applying the optimal strategy the expected time
before one station has sole access is T(N,n). Another (but
probably suboptimal) way of solving the same access problem
is by adding one (imaginary) station which does not need
access and applying the optimal strategy for the case where
n out of N+1 need access. Thus we ignore the knowledge that
the additional station belongs to the set that doesn't need
access, and the expected time before a single station has
access is T(N+1,n). By definition we cannot improve on the
optimal strategy, certainly not by this particular
procedure. Therefore

$$T(N+1,n) \ge T(N,n).$$

The existence of the limit now follows directly from the
fact that T(N,n) is bounded by T(n):

$$T(N,n) \le T(n) \quad \forall N \ge n$$

Each side of the inequality is the cost achieved by an
optimal access strategy, but in the case corresponding to
the left-hand side the strategy is based on more
information, which implies the inequality.


(2:2.16) Lemma. Let u be the optimal number of stations
with green light, as given by the minimization in (2:2.8).
Then for fixed n, $u \to \infty$ and $N-u \to \infty$ as $N \to \infty$. More
precisely,

$$\forall Z\ \exists M \text{ such that } N > M \Rightarrow u > Z \text{ and } N-u > Z$$

Proof. Suppose the lemma is not true. Then $\exists Z$ such that $\forall M$,
no matter how large, $\exists N > M$ with the corresponding $u < Z$. Given
any small $\varepsilon > 0$, $\exists M_\varepsilon$ such that $\forall N > M_\varepsilon$

(2:2.17)  $H(0|N,n,u) > 1-\varepsilon$ for $u < Z$

i.e. when the total number of stations is made large enough
and the number of stations with green light is known to be
less than Z then the probability of 0 stations accessing the
wire can be made arbitrarily close to 1. From the existence
of $\lim_{N\to\infty} T(N,n)$ we know that for the same $\varepsilon$, $\exists M'_\varepsilon$ such that for
$N > M'_\varepsilon$

(2:2.18)  $T(N,n) - \varepsilon < T(N-u,n)$ for $u < Z$

Now let $M = \max\{M_\varepsilon, M'_\varepsilon\}$ and take $N > M$ such that the
corresponding $u < Z$. From (2:2.8) and the definition of u we
get directly that

$$T(N,n) \ge H(0|N,n,u)\,(1 + T(N-u,n))$$

Using the inequalities (2:2.17) and (2:2.18) this gives

$$T(N,n) > (1-\varepsilon)(1 + T(N,n) - \varepsilon)$$

This leads to a contradiction since $\varepsilon$ could be chosen
arbitrarily small, which proves the $u \to \infty$ part of the
lemma. To prove the $N-u \to \infty$ part we can repeat the same
argument with N-u and u mutually interchanged (except where
u appears as an argument of the hypergeometric distribution
H(·)) and with H(n|N,n,u) instead of H(0|N,n,u), i.e. when
the total number of stations is made large enough and the
number of stations with green light is known to differ less
than Z from N then the probability of all n stations
accessing the wire can be made arbitrarily close to 1.

(2:2.19) Theorem. The symmetric solution (2:2.10) is
asymptotically optimal in the sense that for fixed n

$$\lim_{N \to \infty} T(N,n) = T(n)$$

i.e. the minimum cost of the unrestricted solution (2:2.7)
approaches the minimum cost of the symmetric solution as the
number of stations $N \to \infty$.

Proof. First assume the theorem already proven for
n' = 1,...,n-1. In lemma (2:2.14) we defined $\lim_{N\to\infty} T(N,n)$ as $T_n$
and from the proof of the lemma it is obvious that $T_n \le T(n)$.
So it is sufficient to show that $T_n \ge T(n)$. Let $\varepsilon$ be an
arbitrary small positive number. As a direct consequence of
lemma (2:2.14) $\exists Z$ (for this $\varepsilon$) such that $u > Z \Rightarrow
T(u,n') > T_{n'} - \varepsilon$ for n' = 1,...,n. From a limit theorem for the
hypergeometric distribution (see [9] p.59) it follows that
$\exists M_\varepsilon$ such that $N > M_\varepsilon \Rightarrow H(r|N,n,u) > B(r|n,u/N) - \varepsilon$, where B(·)
stands for binomial distribution. By lemma (2:2.16) $\exists M_Z$
such that for $N > M_Z$ the argument of the minimization in
(2:2.8) is guaranteed to lie between Z and N-Z. Therefore we
can derive from (2:2.8) that for $N > \max\{M_\varepsilon, M_Z\}$

$$\begin{aligned}
T(N,n) &= \min_{Z<u<N-Z} \Big\{ \sum_{r=0}^{n} H(r|N,n,u)\big(L(r) + \min\{T(u,r),\,T(N-u,n-r)\}\big) \Big\} \\
&> \min_{v} \Big\{ \sum_{r=0}^{n} \big(B(r|n,v) - \varepsilon\big)\big(L(r) + \min\{T_r,\,T_{n-r}\} - \varepsilon\big) \Big\}
\end{aligned}$$

Since $T_n \ge T(N,n)$ by lemma (2:2.14) and this can be shown for
any $\varepsilon > 0$ we may conclude that

$$T_n \ge \min_{v} \Big\{ \sum_{r=0}^{n} B(r|n,v)\big(L(r) + \min\{T_r,\,T_{n-r}\}\big) \Big\}$$

This inequality can equivalently be formulated as

$$T_n \ge \sum_{r=0}^{n} B(r|n,v)\big(L(r) + \min\{T_r,\,T_{n-r}\}\big) \quad \text{for some } v$$

Now $T_n$ can be brought to the left-hand side, and using the
assumption at the beginning of the proof and the facts that
$L(1) = T(1)$, $T(0) = \infty$ we find

$$T_n \ge \frac{B(0|n,v) + B(n|n,v) + \sum_{r=2}^{n-1} B(r|n,v)\big(1 + \min\{T(r),\,T(n-r)\}\big)}{1 - B(0|n,v) - B(n|n,v)} \quad \text{for some } v$$

which implies

$$T_n \ge \min_{v} \Big\{ \frac{B(0|n,v) + B(n|n,v) + \sum_{r=2}^{n-1} B(r|n,v)\big(1 + \min\{T(r),\,T(n-r)\}\big)}{1 - B(0|n,v) - B(n|n,v)} \Big\}$$

The right-hand side is the same as in equation (2:2.11),
thus

$$T_n \ge T(n)$$

To complete the induction argument observe that for n=1 the
inequality is trivial because $T(N,1) = T_1 = T(1) = 0$. Q.E.D.
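The minimization that appears on the right-hand side at the end of the proof (the expression identified there with (2:2.11)) can be evaluated numerically. The sketch below is an illustration only: it takes T(0) = ∞ and T(1) = 0 as in the proof, and replaces the exact minimization over v by a search on a uniform grid, which is an assumption of this example. The function names are hypothetical.

```python
from math import comb, inf

def binom_pmf(r, n, v):
    """B(r|n,v): binomial probability of r successes in n trials."""
    return comb(n, r) * v**r * (1 - v)**(n - r)

def symmetric_costs(n_max, grid=2000):
    """Return {n: (v_hat, T(n))} for n = 2..n_max, using a grid search over v."""
    T = {0: inf, 1: 0.0}          # boundary values used in the proof
    best = {}
    for n in range(2, n_max + 1):
        best_v, best_T = None, inf
        for k in range(1, grid):
            v = k / grid
            # r = 0 or r = n: one slot is spent and the same problem remains
            stay = binom_pmf(0, n, v) + binom_pmf(n, n, v)
            num = stay + sum(
                binom_pmf(r, n, v) * (1 + min(T[r], T[n - r]))
                for r in range(2, n)
            )
            val = num / (1 - stay)
            if val < best_T:
                best_v, best_T = v, val
        T[n] = best_T
        best[n] = (best_v, best_T)
    return best

for n, (v_hat, cost) in symmetric_costs(9).items():
    print(n, round(v_hat, 2), round(cost, 1))
```

For small n the printed values can be compared with table (2:2.13); small differences in the last digit may arise from the grid resolution.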


Finally we shall consider the case where there is
only 1 bit of on-line information on which to base
decisions, besides the knowledge of the initial conditions
(n0, N0). This cost will give us an upper bound on the minimum
cost for many other cases that are not treated here, such as
the case where the stations can determine whether there is a
collision but not how many stations are involved in the
collision (cf. remark (2:2.6)). This solution is also
mentioned in [10], where it is recognized that n0 may also
not be available information. Therefore, in the paper just
mentioned, the solution is combined with a simple heuristic
estimation procedure for n0.

(2:2.20) Solution. (Case (m)). When the on-line information
tells only whether in the last time slot one station was
successful in acquiring the channel (in which case the
problem ceases) or not, then the minimum expected time before
one station acquires the channel is

(2:2.21)  $T^{(m)}_{n_0} = (1 - 1/n_0)^{-(n_0-1)} - 1$

and the optimal strategy is that each station attempts each
time slot with probability 1/n0 to access the wire as long
as no station has so far been successful in acquiring the
channel.

Proof. Let n be the number of stations that need access and
suppose that in the first time slot all stations have green
light with probability v. Then this attempt will be
successful with probability $p_1 = nv(1-v)^{n-1}$ and the cost (i.e.
number of time slots before one station acquired the
channel) is 0. But with probability $1-p_1$ the attempt will be
unsuccessful, in which case we incur cost 1 and we face in the
second time slot the same problem as we did initially.
Therefore the minimum cost must satisfy the following
equation

$$T^{(m)}_n = \min_{v} \big\{ (1-p_1)\big(1 + T^{(m)}_n\big) \big\}$$

(2:2.22)  $\Rightarrow\ T^{(m)}_n = \min_{v} \Big\{ \frac{1-p_1}{p_1} \Big\}, \qquad p_1 = nv(1-v)^{n-1}$

By straightforward calculus we find that the minimum is
attained for v=1/n. Substituting this value of v in the
equation above gives us immediately (2:2.21).
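The result of the proof can be checked numerically. The following sketch evaluates the cost (1-p1)/p1 of (2:2.22) on a grid of v values and compares the minimizer with 1/n and the minimum with the closed form (1-1/n)^(-(n-1)) - 1. The function names and the grid resolution are choices made for this illustration.

```python
def success_prob(n, v):
    """p1 = n v (1-v)^(n-1): exactly one of n stations attempts in a slot."""
    return n * v * (1 - v) ** (n - 1)

def cost_m(n, v):
    """Expected number of wasted slots when every station attempts with probability v."""
    p1 = success_prob(n, v)
    return (1 - p1) / p1

def closed_form(n):
    """Minimum of cost_m over v, attained at v = 1/n."""
    return (1 - 1 / n) ** (-(n - 1)) - 1

def grid_argmin(n, grid=10000):
    """Grid approximation of the minimizing v (an assumption of this sketch)."""
    return min((k / grid for k in range(1, grid)), key=lambda v: cost_m(n, v))

for n in range(2, 10):
    print(n, round(grid_argmin(n), 2), round(closed_form(n), 1))
```

The grid minimizer should lie close to 1/n for each n, and the closed-form values can be compared with table (2:2.23).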

(2:2.23) Table. The following table gives for entries n the
optimal probability with which the stations should have
green light (i.e. the argument of the minimization over v in
(2:2.22)) and the minimum expected cost.

  n          2     3     4     5     6     7     8     9
  optimal v  0.50  0.33  0.25  0.20  0.17  0.14  0.13  0.11
  T(m)       1.0   1.3   1.4   1.4   1.5   1.5   1.5   1.6

(2:2.24) Conclusion. Comparing tables (2:2.9), (2:2.13) and
(2:2.23) we see that the cost of the symmetric solution is
close to the cost of the unrestricted solution for large
values of N0 (say, N0=50 or larger) and small values of
n0. It is only important to consider small values of n0,
because for a wire which is not overloaded n0=2 is the most
likely initial value and each next higher value is an order
of magnitude less likely than its preceding value. When the
wire is overloaded other controls than the ones considered
in this section should be employed. Note that the solutions
of case (s) and case (m) are independent of N0. This
independence enhances the properties of simplicity and
robustness. When some additional cost is added to the
derived cost of case (u) to account for greater complexity
and less robustness, then the solution of case (s) will
always be superior to the solution of case (u) for N0 large
enough. The numerical results suggest that this value need
not be impractically large. If simplicity is a factor with
great weight, then even case (m) could have the best
solution.


CHAPTER 3: DECENTRALIZED CONTROL
IN PACKET SWITCHED SATELLITE COMMUNICATION


3:1 Introduction and Model.


The  packet  switched  satellite  multi-access 
broadcast  channel  is  becoming  an  ever  more  important  medium 
for  digital  communication.  Finding  control  schemes  that 
ensure  an  efficient  use  of  the  channel  is  a non-trivial 
problem,  mainly  because  any  station  that  requires  access  to 
the  channel  at  a given  time  can  only  determine  with  a fixed, 
significant  delay  whether  another  station  accesses  the 
channel  at  the  same  time,  where  the  delay  is  the  propagation 
delay  for  electromagnetic  waves  from  a ground  station  to  the 
satellite  and  back  to  the  stations  plus  the  transmission 
delay for putting one packet on the channel. When two or
more stations access the channel at the same time a
"collision" occurs, in which case the packets involved need
to be retransmitted. We have here a control problem with
many decision makers (the stations) each having partial
information on the state of the channel and the states of the
stations.


(3:1.1) Remark. In the following model we are dealing with
"new" packets. Transmitted packets are usually stored at a
station, both for comparison with the echo from the satellite
to determine whether a collision occurred and to have them
available for retransmission when there was a collision.
Retransmissions enter the present formulation only
through the cost of a collision K.

(3:1.2) Model. (Description.) Each station has a single
buffer that can hold one packet. Packets are of fixed
length and it takes one unit of time to take in or to send
out one packet. The arrivals into the buffer and departures
from the buffer are slotted, i.e. they start at integer
instants of time. The arrivals at one station are
independent of those at all other stations and form a
Bernoulli process with constant parameter p, which is the
same for all stations. This means that, at each station, a
time slot will bring a packet with probability p. The
departures are controlled by the stations on a decentralized
basis. There will be a packet leaving a station, up to the
satellite, when at the beginning of a time slot the buffer
of that station is full and the station has "green light
on". The buffer may be filled with a new packet at the same
time that the previous packet goes out. Arrivals that occur
when the buffer is full are refused. If two or more
stations transmit a packet at the same time, then these
transmissions will be unsuccessful; it is said that a
collision occurred.
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(3:1.3)  Model.  (Diagram.  ) for  each  station  i we  have: 


[Diagram: arrivals $v_i(t)$ enter the buffer, whose state is $x_i(t)$; departures from the buffer are controlled by $u_i(t)$.]

The variables x, u and v for station i take values in {0,1}
according to

  $x_i(t)$ = 0   buffer is empty
         = 1   buffer is full

  $u_i(t)$ = 0   red light
         = 1   green light

  $v_i(t)$ = 0   no packet arrives
         = 1   packet arrives

$x_i(t)$ is the state of station i, $u_i(t)$ is the control of
station i and $\{v_i(t) \mid t=1,2,\dots\}$ is a Bernoulli process with
parameter p.


(3:1.4) Expressions. Although the variables involved are
basically Boolean, we shall write the expressions for the
dynamics and cost as real algebraic expressions for later
convenience. The next state of station i is given by

(3:1.5)  $x_i(t+1) = v_i(t) + (1-u_i(t))(1-v_i(t))\,x_i(t)$

We define units of cost by stating that one packet which
waits one time slot (because $u_i(t)=0$) represents one unit of
delay. Let the controls of all stations at all time slots
under consideration $\{u_i(t) \mid i=1,\dots,N,\ t=1,\dots,T\}$ be given.
Then the expected cost for the whole system expressed in
units of delay is given by the somewhat clumsy expression

(3:1.6)  $C(u) = E\Big\{ \sum_{t=1}^{T} \Big\{ \sum_{i=1}^{N} (1+B)\,x_i(t)(1-u_i(t)) + K\Big[ 1 - \prod_{i=1}^{N} (1 - x_i(t)u_i(t)) - \sum_{i=1}^{N} x_i(t)u_i(t) \prod_{j \ne i} (1 - x_j(t)u_j(t)) \Big] \Big\} \Big\}$

where  B is  the  cost  of  having  the  station  blocked,  i.e.  the 

cost  of  refusing  packets  if  they  arrive,  and  K is  the  cost 

of  collision.  The  first  term  in  (3:1.6)  adds  (1+B)  units  of 

delay  whenever  a buffer  is  full  and  no  transmission  takes 

place.  In  the  second  term  the  factor  which  multiplies  K 

takes  the  value  1 if  at  time  t there  are  two  or  more 

stations  which  have  green  light  and  their  buffers  full; 

otherwise,  its  value  is  0. 

The objective, of course, is to minimize cost.
However at this point this statement is not well-defined.
The reason is that the expression (3:1.6) gives the expected
cost for given values of $\{u_i(t) \mid i=1,\dots,N,\ t=1,\dots,T\}$, but
these values will typically not be predetermined. Instead
they will be a function of the information available to the
stations at every instant of time, which by the nature of
our model is a random variable and cannot be known a priori.
Now we shall define the information on which station i at
time t could base its decision $u_i(t)$.


(3:1.7) Definition. When there is a team of N decision
makers, each one observing a component $x_i(t)$ of the state
vector $x(t)$, and if each decision maker can inform all
others about his observation with delay d (integer units of
time) then we call this a delayed sharing of state
information structure. It is assumed that each decision
maker has no limitation with respect to storing information.
This definition is a variant of a more general definition of
delayed sharing patterns [11].

(3:1.8) Definition. An admissible decision rule or
admissible control law is a collection of maps $g = \{g_i(t;\cdot) \mid
i=1,\dots,N,\ t=1,\dots,T\}$ which determine i's control at time t
as a function of i's information at time t. That means we
can write

(3:1.9)  $u_i(t) = g_i\big(t;\ x_1(t-d),\dots,x_1(1);\ \dots;\ x_i(t),\dots,x_i(t-d),\dots,x_i(1);\ \dots;\ x_N(t-d),\dots,x_N(1)\big)$

in the case of delayed sharing of state information. We
shall denote the class of admissible control laws by G. An
admissible control law $g = \{g_i(t;\cdot) \mid i=1,\dots,N,\ t=1,\dots,T\}$ is
said to be open loop if all the functions $g_i(t;\cdot)$ are
constant maps, i.e. the decision of station i at time t is
independent of its information at time t. The subclass of G
which contains all open loop control laws will be denoted by
$G_0$.

(3:1.10) Remark. Besides the information which is mentioned
in definition (3:1.7) and written out in (3:1.9) explicitly,
it is also assumed that implicit information is available to
the stations, namely: the underlying model,*) the value of
the parameter p of the arrival processes and the control law
which is in effect. One consequence of this assumption is
that station i not only knows the state history of the other
stations with delay d, but it also knows their control
history with the same delay. Another consequence is that
unknown quantities such as $u_j(s)$, $x_j(s)$, $j \ne i$, $s > t-d$ are well
defined random variables from the viewpoint of station i.


*) The knowledge of station i that it is number i, whence
its control law can be a function of the number assigned to
it, is here considered implicit in knowing the underlying
model. Thus, in terms of chapter 2, symmetric solutions are
not considered. The motivation is that, compared with a wire
communication system, a satellite communication system does
not have a large number of stations; a-priori agreement
on which station has which number should be assumed.


(3:1.11) Problem Formulation. Now that we have defined the class
of admissible control laws, we can state the problem we
are trying to solve. First note that for a given control
law g, the quantities that appear in the expression for
cost (cf. (3:1.6)) are well defined random variables.
Let us denote the expected cost for given g by C(g). Then the
problem is to find g which minimizes C(g), i.e.

$$\min_{g \in G} \{C(g)\}$$


3:2 Optimal Control Law for New Packets.

We  start  with  a theorem  which  reduces  the 
complexity  of  our  problem  tremendously  by  stating  that,  with 
a reasonable  assumption,  the  optimal  control  law  for  our 
single  buffer  model,  with  delayed  sharing  of  state 
information,  will  be  an  open  loop  control  law. 

(3:2.1) Theorem. Assume that each station i has green light
at least once within any interval of d consecutive time
slots:

$$\forall t\ \exists r,\ t \le r < t+d \text{ such that } u_i(r) = 1$$

Then the minimum cost will be attained within the subclass
$G_0$:

$$\min_{g \in G} \{C(g)\} = \min_{g \in G_0} \{C(g)\}$$

Proof. It is sufficient to show that at any instant any
station i cannot improve its decision by using its
information instead of no information. If $x_i(t)=0$ then the
cost incurred at stage t as well as the next state $x_i(t+1)$
is independent of $u_i(t)$ (cf. (3:1.5) and (3:1.6)).
Therefore $u_i(t)$ can be taken the same as when $x_i(t)=1$, which
implies that $g_i(t;\cdot)$ can be a constant map with respect to
$x_i(t)$. Furthermore, note that since $x_i(t-1),\dots,x_i(t-d+1)$
are unknown to the other stations and since the arrival
processes at the stations are uncorrelated, they give no
information on the present state of the other stations,
which implies that $g_i(t;\cdot)$ also can be a constant map with
respect to $x_i(t-1),\dots,x_i(t-d+1)$. At this point the only
possibly relevant arguments of the function $g_i(t;\cdot)$ are
$\{x_j(s) \mid j=1,\dots,N,\ s=t-d,t-d-1,\dots,2,1\}$, which we call the
delayed state history. However, because of the assumption
of the theorem, we know that all stations have had $u_j(r)=1$
for at least one value of r in the interval t-1,...,t-d. If
$u_j(r)=1$ then $x_j(r+1)$ is independent of the value of $x_j(r)$,
and as a consequence the present state of the system is
independent of the delayed state information and therefore
$g_i(t;\cdot)$ can be a constant map with respect to this
information as well. Altogether it means that $g_i(t;\cdot)$ can
be restricted to be a constant map without increasing the
value of the minimum cost.

The  next  theorem  states  that  there  is  no  reason  to 
enlarge  the  solution  space  to  include  randomized  decisions 
(however,  cf.  previous  footnote).  For  clarity,  we  shall 
consider  only  randomizing  over  the  set  of  open  loop  control 
laws  G0.  However  a similar  argument  could  be  given  if  we 
were to randomize over the set G. Thus theorem (3:2.3) is
in fact independent of theorem (3:2.1).


(3:2.2) Definition. A randomized decision rule or
randomized control law is given by a probability measure Pm
on the set of control laws $G_0$, which specifies for each
control law the probability that it will be in effect.
E.g., let N=2 and T=2. Then $G_0$ has $2^4$ elements. The
randomized decision rule corresponding to 'each station has
each time slot green light with probability .5 and these
events are independent' is given by the probability measure
that assigns to each of the elements in $G_0$ probability $2^{-4}$.
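The example with N=2 and T=2 can be written out explicitly. The sketch below enumerates all open loop control laws as 0/1 tuples and attaches to each the probability it receives under independent fair coin flips. The tuple layout (u_1(1), u_2(1), u_1(2), u_2(2)) is a convention chosen for this illustration.

```python
from fractions import Fraction
from itertools import product

N, T = 2, 2

# Every open loop law fixes u_i(t) in {0, 1} for each station and each slot.
G0 = list(product((0, 1), repeat=N * T))

# Independent green lights with probability 1/2 give each law the same weight.
prob = {u: Fraction(1, 2) ** (N * T) for u in G0}

print(len(G0))            # number of open loop laws
print(prob[(1, 0, 0, 1)]) # probability of one particular law
```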

(3:2.3) Theorem. Let $G_r$ denote the space of randomized
control laws and $G_0$, as before, the space of open loop
control laws then

$$\min_{u \in G_r} \{C(u)\} = \min_{u \in G_0} \{C(u)\}$$

i.e., the minimum cost is achieved within the class of open
loop laws.

Proof. Suppose we number the elements of $G_0$ such that $G_0 =
\{u_1, u_2, \dots, u_c\}$, where c is the number of elements in $G_0$.
Then any randomized control law is given in an obvious way
by a sequence of probabilities $(p_1, p_2, \dots, p_c)$, such that
$\sum_{i=1}^{c} p_i = 1$. The expected cost for such a randomized control
law is $\sum_{i=1}^{c} p_i C(u_i)$. Now let $\ell = \arg\min_i C(u_i)$, then the
expected cost $C(u_\ell)$ for the open loop control law $u_\ell$ is
clearly not larger than the expected cost for that
randomized control law. Thus

$$\min_{u \in G_0} \{C(u)\} \le \min_{u \in G_r} \{C(u)\}$$

But from the fact that $G_0$ is a subset of $G_r$ we know that

$$\min_{u \in G_r} \{C(u)\} \le \min_{u \in G_0} \{C(u)\}$$
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This  proves  the  theorem,  which  is  well  known  in  Bayesian 
decision  theory  (Cf.  [12],  Ch.  8). 

(3:2.4) Model. The variable $x_i(t)$ which describes the state
of the buffer of station i at time t takes only the values 0
and 1. Therefore the probability of the buffer being full is

$$\Pr\{x_i(t) = 1\} \triangleq \bar{x}_i(t) = E\{x_i(t)\}$$

Under an open loop control law, given by a predetermined set
of values of the control variables $u = \{u_i(t) \mid i=1,\dots,N,\
t=1,\dots,T\}$, the evolution of $\bar{x}_i(t)$ is deterministic and can
be found from (3:1.5) by taking expectations on both sides
of the equation resulting in

(3:2.5)  $\bar{x}_i(t+1) = p + (1-u_i(t))(1-p)\,\bar{x}_i(t)$

Similarly the expected cost C(u) for an open loop control
law represented by u can be found from (3:1.6) by taking
expectations using the fact that for given u:

$$E\{x_i(t)x_j(t)\} = E\{x_i(t)\}E\{x_j(t)\} = \bar{x}_i(t)\bar{x}_j(t) \quad (i \ne j)$$

because $x_i(t)$ and $x_j(t)$ are independent random variables.
This gives

(3:2.6)  $C(u) = \sum_{t=1}^{T} \Big\{ \sum_{i=1}^{N} (1+B)\,\bar{x}_i(t)(1-u_i(t)) + K\Big[ 1 - \prod_{i=1}^{N} (1 - u_i(t)\bar{x}_i(t)) - \sum_{i=1}^{N} u_i(t)\bar{x}_i(t) \prod_{j \ne i} (1 - u_j(t)\bar{x}_j(t)) \Big] \Big\}$
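The deterministic recursion (3:2.5) and the cost (3:2.6) are straightforward to evaluate for a given open loop schedule. The sketch below is an illustration under stated assumptions: the collision factor is taken as 1 minus the probability that no station sends minus the probability that exactly one station sends, as described in the text following (3:1.6), and the initial buffer probabilities x0 are supplied by the caller. The function names and the two-station example are hypothetical.

```python
def step_mean(xbar, u, p):
    """(3:2.5): probability that each buffer is full in the next slot."""
    return [p + (1 - ui) * (1 - p) * xi for xi, ui in zip(xbar, u)]

def slot_cost(xbar, u, B, K):
    """One time-slot term of (3:2.6): waiting cost plus expected collision cost."""
    waiting = sum((1 + B) * xi * (1 - ui) for xi, ui in zip(xbar, u))
    send = [ui * xi for xi, ui in zip(xbar, u)]   # P(station i transmits)
    none = 1.0
    for s in send:
        none *= 1 - s
    exactly_one = 0.0
    for i, s in enumerate(send):
        others_silent = 1.0
        for j, sj in enumerate(send):
            if j != i:
                others_silent *= 1 - sj
        exactly_one += s * others_silent
    return waiting + K * (1 - none - exactly_one)

def total_cost(schedule, x0, p, B, K):
    """Sum the slot costs while propagating the mean state along the schedule."""
    xbar, total = list(x0), 0.0
    for u in schedule:
        total += slot_cost(xbar, u, B, K)
        xbar = step_mean(xbar, u, p)
    return total

p = 0.5
all_together = [(1, 1)] * 4
round_robin = [(1, 0), (0, 1), (1, 0), (0, 1)]
print(total_cost(all_together, [p, p], p, B=0.0, K=1.0))
print(total_cost(round_robin, [p, p], p, B=0.0, K=1.0))
```

With two stations, the first schedule pays only for collisions and the second only for waiting, which is the trade-off that the remainder of the section analyses.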

(3:2.7) Problem. In the new model formulation above, which
was based on the knowledge that the control law would be
open loop, we can interpret the vector $\bar{x}(t) =
(\bar{x}_1(t),\dots,\bar{x}_N(t))'$ as the new (but now deterministic) state
of the system. From this viewpoint the problem

$$\min_{u \in G_0} \{C(u)\}$$

with the cost C(u) given by (3:2.6) and the dynamics by
(3:2.5) constitutes a dynamic programming problem.


(3:2.8) Remark. Although there are well-known solution
techniques for dynamic programming problems, none of those
will allow us to solve our problem for, say, more than 3 or
4 stations, because then the state space, which is the
N-dimensional hypercube $[0,1]^N$, becomes "too big". This
limitation is known as "the curse of dimensionality".


The  conclusion  is  that  we  need  to  restrict  even 
further  our  solution  space  if  we  are  to  find  the  optimal 
decision  rule.  The  next  restriction  serves  this  purpose. 

(3:2.9) Restriction. Let $q_0(t)$ be the probability that at
time t the channel is silent, that is: no packet leaves any
station. Then we shall restrict ourselves to solutions for
which $q_0(t)$ is constant over time.

This  restriction  seems  not  unreasonable  because  we 
want  to  spread  out  transmissions  as  evenly  as  possible  in 
order  to  minimize  the  chance  of  collision  for  a given 
traffic  intensity,  which  should  result  in  a constant 
probability  of  no  traffic.  In  fact  we  could  conjecture  that 
this  is  not  a restriction  but  a property  of  the  optimal 
solution.  Either  way,  the  real  importance  lies  in  the  fact 
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that  it  enables  us  to  prove  theorem  (3:2.21).  This  theorem 
is  powerful  through  the  simplicity  of  its  conclusion.  It 
states  that  the  optimal  control  law  has  either  green  light 
for  all  stations  all  the  time  (All  Together)  or  green  light 
for  each  station  in  turn  (Round  Robin).  Before  we  embark 
upon  theorem  (3:2.21)  we  need  some  definitions  and  lemmas. 

(3:2.10) Definitions. A station is said to be of age b at
the beginning of a time slot if the start of its last time
slot with green light was b units of time ago (one time slot
lasts one unit of time). If station i is of age b at time t
then this implies that $\bar{x}_i(t) = 1-(1-p)^b$. The total age at
time t, $b_T(t)$, is the sum of the ages over all stations.
The total age sending at t, $b_S(t)$, is defined as the sum of
the ages over all stations that have $u_i(t) = 1$ (green light).

(3:2.11) Lemma. Restriction (3:2.9) implies that the total
age sending is constant and equals N.

Proof. Suppose that at time t there are n stations with
green light having ages $b_1(t),\dots,b_n(t)$. Then the
probability of no packet leaving the system in the t-th time
slot is equal to the product of the probabilities of the
sending stations having an empty buffer:

$$q_0(t) = \prod_{j=1}^{n} (1-p)^{b_j(t)} = (1-p)^{b_S(t)}$$

Since $q_0(t)$ is constant, $b_S(t)$ must be constant as well:
$b_S(t) = b_S$. The evolution of the total age of the system can
be seen to be

$$b_T(t+1) = b_T(t) + N - b_S$$

The total age of the system is bounded from below by N and
from above by $N \cdot d$, where the latter bound comes from the
assumption made in theorem (3:2.1). The only way that
$b_T(t)$, t=1,...,T can stay within those bounds for large T,
is for $b_S = N$, Q.E.D.
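The bookkeeping of ages used in this proof can be traced with a few lines of code. The sketch below assumes that a station with green light has age 1 in the next slot and that every other station grows one unit older; the zero-based station index and the starting ages for the round robin schedule are choices made for this illustration.

```python
def simulate_ages(schedule, ages0):
    """Return a list of (total age, total age sending) pairs, one per time slot."""
    ages = list(ages0)
    history = []
    for u in schedule:
        b_T = sum(ages)
        b_S = sum(b for b, ui in zip(ages, u) if ui == 1)
        history.append((b_T, b_S))
        # Stations with green light restart at age 1, all others grow older.
        ages = [1 if ui == 1 else b + 1 for b, ui in zip(ages, u)]
    return history

def round_robin(N, T):
    """Station i (zero-based) has green light in the slots with t mod N == i."""
    return [tuple(1 if t % N == i else 0 for i in range(N)) for t in range(T)]

N = 4
rr = simulate_ages(round_robin(N, 12), ages0=[N - i for i in range(N)])
at = simulate_ages([(1,) * N] * 5, ages0=[1] * N)
print(rr[:4])
print(at[:2])
```

In both schedules the total age sending equals N in every slot, and for an arbitrary schedule the recorded pairs follow the recursion for the total age given in the proof.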

(3:2.12) Lemma. The cost function (3:2.6) can alternatively
be written as

(3:2.13)  $C(u) = \sum_{t=1}^{T} c(t)$

in which c(t) is given by

$$c(t) = (1+B) \sum_{j=1}^{n} \Big[ b_j - \frac{1-(1-p)^{b_j}}{p} \Big] + K\Big[ 1 - (1-p)^N - \sum_{j=1}^{n} \big(1-(1-p)^{b_j}\big)(1-p)^{N-b_j} \Big]$$

where it is assumed that at time t there are n stations
having green light with ages $b_1,\dots,b_n$ respectively.

Proof. Both in (3:2.6) and in (3:2.13) the cost consists of
two terms. The first one is the cost of having a packet and
not sending: in short, the cost of waiting. This term has
the factor (1+B). The second term is the cost of a single
collision K, multiplied by the probability of collision: in
short, the cost of collision. The difference between
(3:2.6) and (3:2.13) is a matter of accounting: in (3:2.6)
the cost is counted for each station in each time slot; in
(3:2.13) we count in a time slot the cost only for the
stations which have green light in that time slot, including
their cost of waiting during previous time slots that was
not accounted for. A second difference is that in (3:2.13)
$\bar{x}_i(t)$ is calculated explicitly from (3:2.5). So the cost of
waiting for the j-th station which has green light while at
age $b_j$ is

$$(1+B)\big[(1-(1-p)) + \dots + (1-(1-p)^{b_j-1})\big] = (1+B)\Big[ b_j - \frac{1-(1-p)^{b_j}}{p} \Big]$$

The second term in (3:2.13) is easily recognized by noting
that the probability that none of the stations with green
light has a packet is

$$(1-p)^{b_S} = (1-p)^N$$

and the probability that the j-th station with green light
is the only one which has a packet is

$$\big(1-(1-p)^{b_j}\big)(1-p)^{N-b_j}$$

(3:2.14) Lemma. Among all age distributions $\{b_1,\dots,b_n\}$ of
the total age sending $b_S = N$ the only two that can possibly
minimize the cost (3:2.13) are those with

$$b_1 = b_2 = \dots = b_N = 1$$

or

$$b_1 = N$$

Proof. Consider any other age distribution, say
$b_1,\dots,b_n$.
For it to be different from the two mentioned before we must
have 1<n<N and for at least one j: $b_j > 1$, say, without loss
of generality, $b_1 > 1$. We shall construct two variants of the
age distribution under consideration, of which always one
will give strictly lower cost, thereby showing that the age
distribution that we considered cannot be optimal.

Variant 1: instead of 1 station of age $b_1$ and 1 of age $b_2$
we have $b_1+b_2$ stations of age 1:

$$1, 1, \dots, 1, b_3, \dots, b_n$$

Variant 2: instead of 1 station of age $b_1$ and one station
of age $b_2$ we have 1 station of age $b_1+b_2$:

$$b_1+b_2, b_3, \dots, b_n$$

First consider variant 1. Since a station of age 1 has no
cost of waiting we find that the decrease in the cost of
waiting is:

$$(1+B)\Big[ b_1 - \frac{1-(1-p)^{b_1}}{p} + b_2 - \frac{1-(1-p)^{b_2}}{p} \Big]$$

(3:2.15)  $= \frac{1+B}{p}\big[ (1-p)^{b_1} + b_1 p - 1 + (1-p)^{b_2} + b_2 p - 1 \big]$

Also from (3:2.13) we find that for variant 1 the increase
in the cost of collision is

$$K\big[ (1-(1-p)^{b_1})(1-p)^{N-b_1} + (1-(1-p)^{b_2})(1-p)^{N-b_2} - (b_1+b_2)\,p\,(1-p)^{N-1} \big]$$

after some manipulation this is

$$= K(1-p)^{N-b_1-b_2}\big[ (1-p)^{b_2}\big(1-(1-p)^{b_1} - b_1 p (1-p)^{b_1-1}\big) + (1-p)^{b_1}\big(1-(1-p)^{b_2} - b_2 p (1-p)^{b_2-1}\big) \big]$$

Now use the following inequality (for $0<p<1$ and $a \ge 1$):

$$1 - (1-p)^a - ap(1-p)^{a-1} \le (1-p)^a + ap - 1$$

which can be proven by showing that both sides are 0 for p=0
but the slope of the right-hand side is consistently greater
on the interval (0,1). Then the increase in cost of
collision is at most

$$K(1-p)^{N-b_1-b_2}\big[ (1-p)^{b_2}\big((1-p)^{b_1} + b_1 p - 1\big) + (1-p)^{b_1}\big((1-p)^{b_2} + b_2 p - 1\big) \big]$$

which is in turn strictly less than

(3:2.16)  $K(1-p)^{N-b_1-b_2}\big[ (1-p)^{b_1} + b_1 p - 1 + (1-p)^{b_2} + b_2 p - 1 \big]$.

By comparing (3:2.15) and (3:2.16) we see that there will be
improvement if

(3:2.17)  $\frac{1+B}{p} \ge K(1-p)^{N-b_1-b_2}$

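Now consider variant 2. The decrease in the cost of
collision is

$$K\left[\left(1-(1-p)^{b_1+b_2}\right)(1-p)^{N-b_1-b_2}-\left(1-(1-p)^{b_1}\right)(1-p)^{N-b_1}-\left(1-(1-p)^{b_2}\right)(1-p)^{N-b_2}\right]$$

(3:2.18)
$$=K(1-p)^{N-b_1-b_2}\left[\left(1-(1-p)^{b_1}\right)\left(1-(1-p)^{b_2}\right)\right]$$

The increase in cost of waiting is

$$(1+B)\left[\left(1-(1-p)^{b_1}\right)+\dots+\left(1-(1-p)^{b_1+b_2-1}\right)-(1-(1-p))-\dots-\left(1-(1-p)^{b_2-1}\right)\right]$$

(3:2.19)
$$=\frac{1+B}{p}\left[\left(1-(1-p)^{b_1}\right)\left(1-(1-p)^{b_2}\right)\right]$$

Comparing (3:2.18) and (3:2.19) we see that there will be
improvement if

(3:2.20)
$$\frac{1+B}{p}<K(1-p)^{N-b_1-b_2}$$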

It  follows  immediately  from  (3:2.17)  and  (3:2.20)  that 
always  at  least  one  of  the  variants  gives  lower  cost,  which 
now  proves  the  lemma. 
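
The lemma can be checked numerically for small N by brute force. The sketch below is only an illustration: the function names are hypothetical, and the cost function is assembled from the waiting and collision terms quoted in this proof, which is an assumption about the exact form of (3:2.13).

```python
def stage_cost(ages, p, K, B):
    """Waiting plus collision cost for green-light stations with the given ages.

    Assumed form of (3:2.13): waiting term (1+B)[b - (1-(1-p)^b)/p] per station,
    collision term K * Pr{two or more of the stations have a packet}.
    """
    N = sum(ages)
    q = 1.0 - p
    waiting = sum((1 + B) * (b - (1 - q ** b) / p) for b in ages)
    exactly_one = sum((1 - q ** b) * q ** (N - b) for b in ages)
    collision = K * (1 - q ** N - exactly_one)
    return waiting + collision


def partitions(n, largest=None):
    """Yield all partitions of n as non-increasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def best_age_distribution(N, p, K, B):
    """Age distribution with total age N that minimizes the stage cost."""
    return min(partitions(N), key=lambda ages: stage_cost(ages, p, K, B))


if __name__ == "__main__":
    for p in (0.05, 0.2, 0.5):
        print(p, best_age_distribution(6, p, K=5.0, B=0.0))
```

For every p tried, the minimizer should be either all ages equal to 1 or a single station of age N, as the lemma states.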

(3:2.21) Theorem. The optimal control law for 'new'
packets, depending on the value of p, is to have green light
for all stations all the time:

$$u_i(t)=1 \quad \forall i,t \quad \text{if } p<p_0 \quad \text{(All Together)}$$

or to have green light for each station in turn:

$$u_i(t)=1 \text{ for } t \bmod N=i-1 \quad \text{if } p\ge p_0 \quad \text{(Round Robin)}$$

where $p_0$ is the unique solution on the interval (0,1) of

(3:2.22)
$$(1+B)\left[N-\frac{1-(1-p)^N}{p}\right]=K\left[1-(1-p)^N-Np(1-p)^{N-1}\right]$$

Proof. From theorems (3:2.1) and (3:2.3) we know that the
optimal control law is a predetermined pattern of 0's and
1's. From lemma (3:2.14) we know that we should have (if
feasible) in each time slot either N stations of age 1 with
green light or 1 station of age N. Using lemma (3:2.12),
the cost per stage is then respectively

$$K\left[1-(1-p)^N-Np(1-p)^{N-1}\right]$$

or

$$(1+B)\left[N-\frac{1-(1-p)^N}{p}\right]$$

The theorem states that we should take whichever is the
cheaper of those two, for given p, K, and B. This choice
is feasible if we choose the proper initial conditions,
namely at t=1 station i is of age 1 $\forall i$ if $p<p_0$, or at t=1
station i is of age N+1-i if $p\ge p_0$. There will be little
objection to choosing initial conditions at convenience, if
we note that the initial conditions are 'erased' as soon as
every station has had green light once and that we are
interested in solutions for large T.
(3:2.23) Remark. An approximate solution to (3:2.22), which
will be a better approximation as the resulting $p_0$ becomes
smaller and N larger, is given by

(3:2.24)
$$p_0\approx\frac{1+B}{K}$$

This value for $p_0$ can be explained as follows: Consider a
station that has a packet and suppose the stations operate
in All Together mode; then the station will send the packet
right away. It estimates the probability that another
station has a packet to be Np, hence with probability Np
there  will  be  a collision  of  cost  K (assuming  that  the  other 
stations  operate  in  the  same  mode  and  neglecting  the 
probability  of  collisions  with  3 or  more  packets  involved). 
Therefore  the  expected  cost  of  collision  per 
station-with-packet  is  KNp/2.  Now  suppose  that  the  stations 
operate  in  Round  Robin  mode,  then  the  station  with  a packet 
has  to  wait  on  average  N/2  time  slots.  Therefore  the 
average  cost  of  waiting  is  (1+B)N/2.  So  p0  is  just  at  the 
crossover  point  where  the  "cost  of  collision  becomes  greater 
than  the  cost  of  waiting". 
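
As a rough numerical illustration, the crossover probability of (3:2.22) can be located by bisection and compared with the approximation (1+B)/K. This is a sketch under stated assumptions: the function names and the parameter values are hypothetical, and a root on (0,1) is only found when the two per-stage costs actually cross there (K large compared with (1+B)(N-1)).

```python
import math


def crossover_gap(p, N, K, B):
    """Round Robin cost per stage minus All Together cost per stage, as in (3:2.22)."""
    none = math.exp(N * math.log1p(-p))            # (1-p)^N
    round_robin = (1 + B) * (N - (1 - none) / p)
    all_together = K * (1 - none - N * p * none / (1 - p))
    return round_robin - all_together


def crossover_probability(N, K, B, lo=1e-6, hi=1 - 1e-6, iters=200):
    """Bisection for the p in (lo, hi) at which the two costs per stage are equal."""
    if crossover_gap(lo, N, K, B) <= 0 or crossover_gap(hi, N, K, B) >= 0:
        raise ValueError("no sign change on (lo, hi); check N, K and B")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if crossover_gap(mid, N, K, B) > 0:
            lo = mid        # All Together is still cheaper at mid
        else:
            hi = mid        # Round Robin is cheaper (or equal) at mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    N, K, B = 10, 100.0, 0.0                       # illustrative values only
    p0 = crossover_probability(N, K, B)
    print("numerical p0:", p0, "approximation (1+B)/K:", (1 + B) / K)
```

With these illustrative values the numerical root lies close to (1+B)/K = 0.01, in line with the remark.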
3:3  Optimal Control Law for Collided Packets.

In  this  section  the  problem  of  what  to  do  after  a 
collision  is  solved  as  a separate  problem.  This  corresponds 
to  assuming  that  as  soon  as  the  echo  from  a collision  comes 
back,  all  stations  are  blocked  for  new  packets  until  the 
collision  is  resolved  in  a sense  to  be  made  precise  below. 
Imposing such a separation between new packets and collided
packets enables us to solve the overall problem in a nice
way; however, it may introduce sub-optimality at the same
time. Specifically, sub-optimality may arise from the build
up of a backlog of new packets during the time the channel
is used for retransmissions. When the system returns to All
Together mode, the probability of a new collision is higher
than was accounted for. For very small values of p, this
sub-optimality will be negligible. For larger values of p,
the system operates in Round Robin mode; therefore no
retransmissions of collided packets take place and there is
no sub-optimality. It is left for further research to solve
the integrated problem for intermediate values of p.
Possibly it can be shown that for these intermediate values
the optimal mode is Round Robin too.

(3:3.1) Model. Suppose in time slot $t_c$ (i.e. the time slot
that started at $t=t_c$) the set of stations which were
transmitting a packet $\{i_1,i_2,\dots,i_r\}$ has two or more
elements. Then in time slot $t_c+d-1$ the echo of a collision
comes back to all stations. In response to this collision
all stations refuse a possible arrival during time slot $t_c+d$
and remain blocked until the collision is resolved; the
stations that were involved in the collision place their
collided packet in their buffer (it is assumed that all
stations keep somewhere copies of transmitted packets until
the transmission is known to be successful). If we 'reset'
time such that $t=t_c+d$ becomes t=1 then we have the following
initial conditions:

$$x_i(1)=\begin{cases}1 & \text{if } i\in\{i_1,i_2,\dots,i_r\}\\ 0 & \text{otherwise}\end{cases}$$

where $\{i_1,i_2,\dots,i_r\}$ is the set of success-indices of a
sequence of N Bernoulli trials with probability of success
p. Note that the initial conditions form a random vector of
which each station only knows its own component. Since
there are no arrivals the dynamics are simply given by

(3:3.2)
$$x_i(t+1)=(1-u_i(t))\,x_i(t)$$

and the cost K for given controls $u_i(t)$, $i=1,\dots,N$,
$t=1,\dots,T_2$, similar to (3:1.6), is

(3:3.3)
$$K=E\left\{\sum_{t=1}^{T_2}\left\{\sum_{i=1}^{N}x_i(t)(1-u_i(t))+NB+K_2\left[1-\prod_{i=1}^{N}\left(1-x_i(t)u_i(t)\right)-\sum_{i=1}^{N}x_i(t)u_i(t)\prod_{j\neq i}\left(1-x_j(t)u_j(t)\right)\right]\right\}\right\}$$

where $K_2$ is the cost of creating a second collision.
Clearly as soon as every station has had $u_i(t_i)=1$ once for
some $t_i<t$ then $x_i(t)=0$ $\forall i$ and the collision is resolved
(there may be a second collision but at this point we assume
that that is taken care of at cost $K_2$). Therefore $T_2$, the
time needed for a second transmission, is defined by

$$T_2=\max\{t \mid x_i(t)=1 \text{ for some } i\}$$

The problem, of course, is to choose the controls
$u_i(t)$ so as to minimize the cost. The approach to the
problem for collided packets parallels the approach in the
preceding sections. First we note that the problem is not
fully defined until we specify what the information is on
which controls can be based. The following theorem shows
that the little information available is of no use.

(3:3.4) Theorem. The controls $u_i(t)$ that minimize the cost
are predetermined (open loop): they will be independent of
the information that each station has about the random
vector which determines the initial conditions.

Proof. Besides implicit information such as the probability
distribution of the initial condition, the only information
available to station i is $x_i(1)$. However if $x_i(1)=0$ then
the next states (of station i) as well as the cost are
independent of the controls $u_i(t)$, $t=1,\dots,T_2$. Therefore
the controls can be taken the same as when $x_i(1)=1$, which
implies that $u_i(t)$, $t=1,\dots,T_2$ will be independent of
the station's involvement in the collision.

(3:3.5) Remark. We shall not consider randomized controls.
They are excluded here too for very much the same reasons as
given in theorem (3:2.3).

In model (3:3.1) there is still one point that has
not  been  fully  described,  namely  the  cost  for  a second 
collision  K2.  It  was  assumed  that  K2  was  a given  quantity, 
but  to  determine  K2  we  would  have  to  solve  a separate 
problem  to  find  a parameter  K3:  the  cost  for  a third 
collision.  This  could  lead  to  a long  chain  K,K2,K3,.... 
The  following  restriction  is  introduced  to  break  the  chain 
right  after  K2. 

(3:3.6) Restriction. The controls must be chosen such that
for any packet there will be no third collision.

The main motivation is that this restriction enables us to
find,  without  too  much  complexity,  an  expression  for  the 
cost  of  collision  K.  But  there  are  other  benefits  from  this 
restriction,  namely  that  it  reduces  the  variance  in  delay 
and the maximum delay. Of course by doing more work one
could have lesser restrictions of no fourth collision, no
fifth collision, and so on up to no N-th collision (beyond
that, the restriction is meaningless); however, this does not
seem worth the effort.

(3:3.7) Problem. From the dynamics (3:3.2) it is clear that
for the second transmission each station needs only one time
slot with green light. From theorem (3:3.4) it follows that
the assignment of a station to one of the time slots
$1,2,\dots,T_2$ can be predetermined and from the 'equality' of
all stations it follows that only the number of stations
assigned to a time slot matters. Restriction (3:3.6)
dictates that whenever a second collision occurs among a
group of stations which were assigned to one time slot then
for the third transmission that group must do 'Round Robin'.
So at this point the problem can be formulated as: find the
grouping

$$m(1),m(2),\dots,m(T_2)$$

such that

$$\sum_{t=1}^{T_2}m(t)=N$$

that minimizes the cost

$$K=\sum_{t=1}^{T_2}k(m(t),t)$$

where  m(t)  is  the  number  of  stations  with  green  light  during 
time  slot  t (of  the  second  transmission)  and  K is  the  total 
cost,  measured  in  units  of  delay,  incurred  by  the  system  as 
a result  of  a collision  in  a first  transmission.  K can  be 
calculated  per  group,  with  the  cost  of  a group  k(m(t),t)  as 
given  in  the  next  lemma. 

(3:3.8) Lemma. Suppose a collision has occurred and let a
group of m stations be assigned to the t-th time slot of the
second transmissions. Then the cost of collision that can
be attributed to that group is (for m>0)

$$k(m,t)=(d+t-1)\frac{mp\left(1-(1-p)^{N-1}\right)}{1-(1-p)^N-Np(1-p)^{N-1}}+NB+\left(d+\frac{m}{2}\right)\frac{mp\left(1-(1-p)^{m-1}\right)}{1-(1-p)^N-Np(1-p)^{N-1}}+mNB\,\frac{1-(1-p)^m-mp(1-p)^{m-1}}{1-(1-p)^N-Np(1-p)^{N-1}}$$

Proof. According to the model (3:3.1) it is determined by a
sequence of N Bernoulli trials which station has a packet
(success) and which has not. The assumption that a
collision occurred means that all probabilities are
conditioned on 'at least two successes', abbreviated as
'coll.'. Define n as the number of stations, within the
group of m stations, that have a packet; then the collision
cost for the group can be written as

$$k(m,t)=(d+t-1)E\{n\mid\text{coll.}\}+NB+\left(d+\frac{m}{2}\right)E\{n\mid n\ge 2\}\Pr\{n\ge 2\mid\text{coll.}\}+mNB\,\Pr\{n\ge 2\mid\text{coll.}\}$$

and explained, term by term, as follows. The first term
gives the expected amount of waiting in the group from the
first (unsuccessful) transmission to the second transmission
for that group. During second transmissions all N stations
are blocked for new packets; this adds a term NB per group.
With probability $\Pr\{n\ge 2\mid\text{coll.}\}$ the group has two or more
stations with a packet, in which case the second transmission
is unsuccessful. Then there is more waiting: per packet
this is one round trip time d plus waiting for its turn when
doing Round Robin among the group members (on average m/2),
and there is more blocking: during m time slots of the
third transmissions all N stations are blocked. By
elementary probability theory we find

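$$E\{n\mid\text{coll.}\}=\sum_{j=1}^{m}j\Pr\{n=j\mid\text{coll.}\}$$

$$=\Pr\{n=1\mid\text{coll.}\}+\sum_{j=2}^{m}\frac{j\Pr\{n=j\}}{\Pr\{\text{coll.}\}}$$

$$=\frac{\Pr\{n=1\ \&\ \text{coll.}\}}{\Pr\{\text{coll.}\}}+\sum_{j=1}^{m}\frac{j\Pr\{n=j\}}{\Pr\{\text{coll.}\}}-\frac{\Pr\{n=1\}}{\Pr\{\text{coll.}\}}$$

$$=\frac{mp(1-p)^{m-1}\left(1-(1-p)^{N-m}\right)+mp-mp(1-p)^{m-1}}{\Pr\{\text{coll.}\}}$$

$$=\frac{mp\left(1-(1-p)^{N-1}\right)}{1-(1-p)^N-Np(1-p)^{N-1}}$$

and

$$E\{n\mid n\ge 2\}=\sum_{j=2}^{m}\frac{j\Pr\{n=j\}}{\Pr\{n\ge 2\}}=\sum_{j=1}^{m}\frac{j\Pr\{n=j\}}{\Pr\{n\ge 2\}}-\frac{\Pr\{n=1\}}{\Pr\{n\ge 2\}}=\frac{mp\left(1-(1-p)^{m-1}\right)}{1-(1-p)^m-mp(1-p)^{m-1}}$$

and

$$\Pr\{n\ge 2\mid\text{coll.}\}=\frac{\Pr\{n\ge 2\}}{\Pr\{\text{coll.}\}}=\frac{1-(1-p)^m-mp(1-p)^{m-1}}{1-(1-p)^N-Np(1-p)^{N-1}}$$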
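
The conditional quantities used in this proof can be cross-checked by exact enumeration over the two independent binomial counts (packets inside the group of m stations, packets among the other N-m stations). The following is a minimal sketch with hypothetical function names, not part of the original analysis.

```python
from math import comb


def binom_pmf(trials, k, p):
    """Probability of exactly k successes in `trials` Bernoulli(p) trials."""
    return comb(trials, k) * p ** k * (1 - p) ** (trials - k)


def enumerated(N, m, p):
    """E{n | coll.} and Pr{n >= 2 | coll.} by summing over all outcomes."""
    pr_coll = e_num = pr_n2 = 0.0
    for n in range(m + 1):                    # packets inside the group
        for rest in range(N - m + 1):         # packets outside the group
            w = binom_pmf(m, n, p) * binom_pmf(N - m, rest, p)
            if n + rest >= 2:                 # 'coll.': at least two successes
                pr_coll += w
                e_num += n * w
                if n >= 2:
                    pr_n2 += w
    return e_num / pr_coll, pr_n2 / pr_coll


def closed_form(N, m, p):
    """The same two quantities from the closed-form expressions of the proof."""
    q = 1 - p
    pr_coll = 1 - q ** N - N * p * q ** (N - 1)
    e_n = m * p * (1 - q ** (N - 1)) / pr_coll
    pr_n2 = (1 - q ** m - m * p * q ** (m - 1)) / pr_coll
    return e_n, pr_n2


if __name__ == "__main__":
    print(enumerated(8, 3, 0.2))
    print(closed_form(8, 3, 0.2))
```

Both functions should agree to within rounding error for any 0 < p < 1 and 1 <= m <= N with N >= 2.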

(3:3.9) Remark. In the preceding lemma it was implicitly
assumed that there is no interference between
retransmissions. E.g., consider the (unlikely) case that a
collision of 4 or more new packets occurred and that in each
of two groups (say, the groups with m(1) and m(2) members
respectively) there are 2 or more stations that have a
packet. Then we neglect the interference which arises in
the 2 overlapping intervals of third transmissions:
$[t_c+2d,\ t_c+2d+m(1)-1]$ and $[t_c+2d+1,\ t_c+2d+m(2)]$.


(3:3.10) Algorithm. The solution to problem (3:3.7) can now
easily be given in an algorithmic form with the approach of
dynamic programming [13]. The algorithm explains itself,
with the following definition in mind. Define J(n,t) to be
the cost when n stations are assigned optimally to the first
t time slots of second transmissions. (Note that the
meaning of the symbol n does not correspond to its previous
meaning.) Further, define $k(0,t)=0$ $\forall t$.

======== initialize ========

for n=1 to N
    J(n,1)=k(n,1)
next n
for all t
    M(N,t)≠0
next t
t=1

======== recursion ========

for t=t+1 until M(N,t)=0
    for n=1 to N
        J(n,t)=min_m{k(m,t)+J(n-m,t-1)}
        M(n,t)=arg{min_m{k(m,t)+J(n-m,t-1)}}
    next n
next t

======== results ========

T2=t-1
K=J(N,T2)
m=N
for t=T2 to 1 step -1
    m(t)=M(m,t)
    m=m-m(t)
next t
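
The recursion above can be sketched in Python as follows. This is an illustrative reading, not the author's program: the function names are hypothetical, the group cost uses the expression of lemma (3:3.8) with the average Round Robin wait taken as m/2, and the "until M(N,t)=0" stopping rule is replaced by the simpler assumption that at most N slots are ever needed (unused trailing slots are dropped at the end).

```python
def collision_prob(n, p):
    """Probability of two or more successes in n Bernoulli(p) trials."""
    return 1.0 - (1.0 - p) ** n - n * p * (1.0 - p) ** (n - 1)


def group_cost(m, t, N, p, d, B):
    """k(m, t): cost attributed to a group of m stations in slot t; k(0, t) = 0."""
    if m == 0:
        return 0.0
    q = 1.0 - p
    pc = collision_prob(N, p)                       # Pr{coll.}, requires N >= 2
    first_wait = (d + t - 1) * m * p * (1.0 - q ** (N - 1)) / pc
    third_wait = (d + m / 2.0) * m * p * (1.0 - q ** (m - 1)) / pc
    third_block = m * N * B * collision_prob(m, p) / pc
    return first_wait + N * B + third_wait + third_block


def optimal_grouping(N, p, d, B):
    """Return (groups, cost): groups[t-1] stations get green light in slot t."""
    inf = float("inf")
    J = [[0.0] + [inf] * N]       # J[t][n]; with zero slots only n = 0 is feasible
    M = [[0] * (N + 1)]           # M[t][n]: optimal number of stations in slot t
    for t in range(1, N + 1):
        J.append([0.0] * (N + 1))
        M.append([0] * (N + 1))
        for n in range(1, N + 1):
            costs = [group_cost(m, t, N, p, d, B) + J[t - 1][n - m]
                     for m in range(n + 1)]
            M[t][n] = min(range(n + 1), key=costs.__getitem__)
            J[t][n] = costs[M[t][n]]
    groups, n = [], N
    for t in range(N, 0, -1):     # trace the optimal assignment backwards
        groups.append(M[t][n])
        n -= M[t][n]
    groups.reverse()
    while groups and groups[-1] == 0:
        groups.pop()              # drop unused trailing slots
    return groups, J[N][N]


if __name__ == "__main__":
    groups, cost = optimal_grouping(N=6, p=0.1, d=10, B=0.5)   # illustrative values
    print("m(1..T2) =", groups, " K =", cost)
```

The length of the returned list plays the role of T2, and the returned cost plays the role of K=J(N,T2).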


(3:3.11) Conclusion. In the preceding sections we looked at
the problem of access control to a packet switched satellite
broadcast channel in the framework of non-classical control
theory. This viewpoint, with the simple model presented,
enabled us to derive a nice solution that differs
interestingly from other approaches to the same problem
[14], [15]. In particular it is shown, by a result of
Bayesian decision theory, that randomized decisions will not
be necessary. Further we found that, under appropriate
assumptions, for 'new' packets only two modes of operation
of the channel could be optimal, namely: 'All Together'
(every station sends whenever it has a packet) or 'Round
Robin' (every station sends whenever it is its turn and it
has a packet), and an equation for the crossover arrival
probability $p_0$ at which the channel would switch from All
Together ($p<p_0$) to Round Robin ($p\ge p_0$) is given. For collided
packets  it  was  found  that  it  is  optimal  to  have 
predetermined  assignments  of  groups  of  stations  to  time 
slots  for  retransmissions,  and  a dynamic  programming 
algorithm  for  determining  the  optimal  grouping  is  given. 

Of course to get these results a number of assumptions had
to be made. Some of the most restrictive assumptions are:
a.) All stations have an equal arrival probability p. b.)
The arrival probabilities stay constant over time. c.) The
stations have only a single buffer for packets that are
ready to be transmitted. Nevertheless we believe the results
of this chapter to be useful, also in more general
situations that do not comply with the restrictions
mentioned. It is currently being investigated as to how
(maybe in a heuristic way) the results of this chapter can
be applied to more general situations. Specifically we have
the following in mind: a.) When stations have different
arrival rates then they could be divided into two classes:
one All Together class for stations with small arrival
probability, and one Round Robin class for stations with
high arrival probability, with time slots divided between
the two classes. b.) When arrival probabilities vary over
time, it will be reasonable to assume that the time between
variations of the arrival probabilities is much longer than
one round trip time to the satellite. In this case the
stations could report periodically what their arrival
probability is and on the basis of those reports the
stations could agree on what presently the proper mode of
operation will be. c.) When in each station, besides the
single buffer, there are auxiliary buffers in which packets
that would otherwise be refused are placed, then additional
delay is incurred by packets in the auxiliary buffer. We
purposely carried the factor B (cost of the single buffer
being blocked) through all of the analyses, because in B one
could summarize all the cost of waiting that is incurred in
the auxiliary buffer.

Other points for further investigation are the performance
and stability of the channel.
CHAPTER  4:  INFORMATION  POLICIES
FOR  ROUTING  IN  A COMPUTER  NETWORK 

4:1  Introduction and Definitions.

Packet  switched  computer  networks  have  already 
proven  to  be  a useful  combination  of  computer  systems  and 
communication  facilities,  and  their  role  in  electronic  data 
processing  will  become  more  and  more  important  [16],  [17]. 
Problems  in  the  technical  operation  of  a computer  network 
range from reliable transmission over the communication
line  (modulation,  demodulation,  hardware  error  correction) 
to  the  communication  between  computers  that  use  the 
communication  network  (host-to-host  communication  protocol). 
At  an  intermediate  level  lies  the  problem  of  how  to  route  a 
packet  of  digital  information  with  a given  destination 
through  the  network.  The  decisions  as  to  what  route  will  be 
taken  by  the  packet  at  a node  in  the  network  can  be  made  by 
the  communication  computer  at  that  node.  This  is  known  as 
distributed  routing  [18].  The  nodes  are  geographically 
separated  and  the  information  which  is  exchanged  between  the 
communication  computers  for  the  purpose  of  making  better 
routing  decisions  travels  on  the  same  network  that  is  to  be 
controlled.  These  facts  make  the  routing  problem  fit  within 
the  framework  of  non-classical  control  theory  (i.e.  more 
than  one  decision  maker).  The  routing  problem  has  already 
been considered within the framework of control theory by a
number  of  authors  (e.g.  [19]  or  [20]).  The  approach  taken 
in  this  chapter  is  somewhat  different  in  the  sense  that  we 
will  concentrate  on  the  underlying  problem  of  what  is  the 
"best"  information  on  which  to  base  the  decisions.  With 
regard  to  the  question  of  what  are  the  "best"  decisions  we 
shall  see  later  in  this  section  that  — given  a suitable 
objective — there  is  an  obvious  answer.  First  we  shall 
specify  the  routing  problem  more  precisely  in  the  following 
paragraphs. 

(4:1.1) Definitions. A network is a set of nodes

$$\mathcal{N}=\{1,2,\dots,N\}$$

together with a set of links between ordered pairs of nodes

$$A=\{(i_1,j_1),\dots,(i_L,j_L) \mid i_n,j_n\in\mathcal{N}\}$$

The node $i_n$ is called the begin point of the link $(i_n,j_n)$,
and correspondingly node $j_n$ is the end point of that link.
At each node i there is a source that generates packets with
destinations $j\in\mathcal{N}$ and there is a sink that absorbs all
packets with destination i. Packets can travel along links.
The first part of this definition is the usual definition of
a directed graph. We assume that $(j,i)\in A$ whenever $(i,j)\in A$,
that is, the lines are full duplex. This assumption is not
essential for the subsequent analysis but is made for
simplicity.
A routing decision made by node i for a packet with
destination $k\in\mathcal{N}$ is an assignment of the packet to one of the
links  that  have  i as  begin  point.  The  routing  decision  will 
depend  on  the  state  of  the  network.  A criterion  for 
deciding  which  decision  is  "best"  is  the  time  needed  for  the 
packet  to  travel  from  source  to  destination.  The  time  needed 
is  obviously  the  sum  of  the  delays  on  each  of  the  "hops" 
lying on the path of the packet (the packet flows
through one hop when it flows through one node and one
link). The state of the network is the vector $D^1$ of all
1-hop delays in the network (one element for each link). The
one-hop delay from node i to node j, $\tilde d_{ij}$, is the sum of the
following terms:

(4:1.2)
$$\tilde d_{ij}=\text{processing time for the packet at node } i+\frac{\text{physical distance between } i \text{ and } j}{\text{speed of light along the path}}+\frac{\text{packet length}}{\text{channel capacity}}+\frac{\sum \text{packet lengths of predecessors in queue}_{ij}}{\text{channel capacity}}$$

In the definition (4:1.2) it is assumed that the packet
always gets to node j. In practice this is not true:
sometimes the packet is lost due to transmission errors or
cannot be received. One scheme to deal with such events is
to acknowledge each well received packet. If no
acknowledgement comes back, the packet has to be
retransmitted (this is the ARPA Network scheme). It is not
hard to include the expected delay due to possible
retransmissions. Let p be the probability that a
retransmission is necessary and let r be the maximum time
that a node waits for an acknowledgement from a neighbor
node. The expected number of retransmissions is

$$\sum_{k=1}^{\infty}kp^k(1-p)=\frac{p}{1-p},$$

and this adds a term

$$(\tilde d_{ij}+r)\frac{p}{1-p}$$

to the delay, where $\tilde d_{ij}$ is the delay of (4:1.2), assuming
that r starts after $\tilde d_{ij}$ has elapsed
and that the retransmitted packet experiences the same
delay, $\tilde d_{ij}$, as a new packet. From now on we shall include
the correction due to possible retransmissions in the
one-hop delay. Thus the one-hop delay is redefined as

(4:1.3)
$$d_{ij}=\tilde d_{ij}+(\tilde d_{ij}+r_{ij})\frac{p}{1-p}=\left(\frac{1}{1-p}\right)\tilde d_{ij}+\left(\frac{p}{1-p}\right)r_{ij}$$

where $\tilde d_{ij}$ is the delay given in equation (4:1.2), p is the
probability that a retransmission is necessary (may depend
on ij) and $r_{ij}$ is the time waited before retransmitting.
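
The retransmission correction can be illustrated with a few lines of Python. This is a sketch with hypothetical function names and illustrative numbers; it only checks that the closed form p/(1-p) matches a truncated version of the series, and that the two forms of the redefined one-hop delay agree.

```python
def expected_retransmissions(p):
    """Closed form of the series sum over k >= 1 of k * p^k * (1 - p), for 0 <= p < 1."""
    return p / (1.0 - p)


def truncated_series(p, terms=500):
    """Direct partial sum of the same series, for comparison with the closed form."""
    return sum(k * p ** k * (1.0 - p) for k in range(1, terms + 1))


def one_hop_delay(raw_delay, r, p):
    """Redefined one-hop delay: raw delay plus the expected retransmission penalty."""
    return raw_delay + (raw_delay + r) * expected_retransmissions(p)


if __name__ == "__main__":
    p, raw, r = 0.1, 10.0, 30.0          # illustrative values, arbitrary time units
    print(expected_retransmissions(p), truncated_series(p))
    print(one_hop_delay(raw, r, p), raw / (1 - p) + r * p / (1 - p))
```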

(4:1.4) Remarks. In the definition of the one-hop delay we
considered  the  output  queues  of  packets  that  were  waiting  at 
node  i to  go  on  the  link  to  i's  neighbor  j,  but  not 
explicitly  the  input  queue  at  node  i of  packets  that  are 
waiting  for  a routing  decision  to  be  made  and  for  other 
processing  required  for  newly  arrived  packets.  This  queue 
can  be  modeled  in  the  same  way  as  the  output  queues  will  be 
modeled, without any additional difficulty. The delay in
the  input  queue  can  be  added  to  the  delays  of  each  of  the 
output  queues.  For  the  sake  of  simplicity  this  term  will  be 
left  out  of  our  subsequent  analysis. 

The one-hop delay as given in (4:1.2)-(4:1.3) contains a
number of terms which depend on the particular link (i,j)
and may vary with time or with every packet. These are $r_{ij}$,
$p_{ij}$, packet length $\beta$, channel capacity $c_{ij}$ and processing
time $t_{ij}$. Again for ease of exposition, we restrict
ourselves for most of this paper to the case where the only
variable is $q_{ij}$, the number of seconds of queueing delay in
queue$_{ij}$, which implies fixed packet length $\beta$ and constant
$r_{ij}$, $p_{ij}$, $c_{ij}$ and $t_{ij}$ for each link. This means that $d_{ij}$ can
be thought of as

(4:1.5)
$$d_{ij}=(k_{ij}+q_{ij})\,k'_{ij}$$

where $k_{ij}$ and $k'_{ij}$ are constants. However in remark (4:2.5)
we give formulas to include variable packet length when the
distribution over the different possible packet lengths is
given. And in remark (4:2.15) we indicate how a node can
update from time to time its estimated values for the
distribution of packet length, $r_{ij}$, $p_{ij}$, $c_{ij}$, and possibly
$t_{ij}$ under the assumption that they change at a much slower
rate than $q_{ij}$.

(4:1.6) Definitions. For conceptual and notational
convenience we can arrange the state vector of one-hop
delays into a matrix of one-hop delays, which we denote by
the same symbol $D^1$, in which the element (i,j) gives the
one-hop delay from i to j. The diagonal elements of this
matrix are defined to be 0 and the pairs $(i,j)\notin A$ have
corresponding entry $\infty$.

The min-sum C of two matrices A and B of compatible
dimensions, denoted by $A\sim B$, is given by

$$(C)_{ik}=\min_j\{(A)_{ij}+(B)_{jk}\}$$

It can be seen directly from this definition that when B is
a matrix of minimum (p-1)-hop delays and A is a matrix of
1-hop delays then C will be the matrix of minimum p-hop
delays.

The matrix of minimum delays is given by $D=D^M$ where $D^p$ is
defined recursively through $D^p=D^1\sim D^{p-1}$ and M is the maximum
hop distance in the network. Note that although $D^M$ contains
M-hop delays only, because of the zeros on the diagonal of
$D^1$, less-than-M-hop delays are essentially included.

So when a node knows the state vector $D^1$ precisely it can
compute the matrix of minimum delays D. However the matrix D
tells a node only what the minimum delay is, not along which
path the minimum delay is attained. A node needs another
matrix on which to base its decisions: the node delay table.
The node delay table of node i, denoted $D_i$, has as element
(j,k) the minimum delay from i to k via j where j is a
neighbor of i (i.e., $(i,j)\in A$). The node delay table can be
computed from $D^1$ and D, namely $(D_i)_{jk}=(D^1)_{ij}+(D)_{jk}$.
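
The min-sum product, the matrix of minimum delays, and the node delay table can be sketched as follows. This is only an illustration: the function names, the 0-based node numbering, and the three-node example network are assumptions made here, and the number of nodes minus one is used as a safe upper bound on the maximum hop distance M.

```python
import math

INF = math.inf   # entry for node pairs that are not joined by a link


def min_sum(A, B):
    """Min-sum product: C[i][k] = min over j of A[i][j] + B[j][k]."""
    return [[min(A[i][j] + B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]


def minimum_delays(D1, max_hops):
    """Matrix of minimum delays, built by repeated min-sum products with D1."""
    D = D1
    for _ in range(max_hops - 1):
        D = min_sum(D1, D)
    return D


def node_delay_table(D1, D, i):
    """For each neighbor j of node i, the delays from i to every k when routed via j."""
    n = len(D1)
    return {j: [D1[i][j] + D[j][k] for k in range(n)]
            for j in range(n) if j != i and D1[i][j] < INF}


if __name__ == "__main__":
    # Hypothetical full-duplex network: links 0-1 and 1-2 with delay 1, link 0-2 with delay 5.
    D1 = [[0, 1, 5],
          [1, 0, 1],
          [5, 1, 0]]
    D = minimum_delays(D1, max_hops=len(D1) - 1)
    table = node_delay_table(D1, D, 0)
    best_next_hop = min(table, key=lambda j: table[j][2])   # destination node 2
    print(D, table, best_next_hop)
```

In this example node 0 reaches node 2 faster via node 1 than over the direct link, which is exactly the information the node delay table exposes and the matrix D alone does not.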
The definitions above lead up to an obvious
decision rule, which is given in the next assertion,
provided we assume that the node delay table is accurate and
that the objective is to minimize delay for an individual
packet.


(4:1.7)  Assertion.  The  optimal  decision  for  minimizing 
delay  for  an  individual  packet,  under  the  assumption  of 
perfect  information,  is  to  send  the  packet  along  the  minimum 
delay  path  as  given  by  the  node  delay  table.  We  call  this 
the  minimum  delay  decision  rule. 

(4:1.8) Remark. The assumption of perfect information is
rather strong. It requires that the decision maker, say node
i,  knows  the  state  of  the  network  D1  precisely  not  only  at 
the  time  that  the  decision  is  made  but  also  ahead  of  time 
for  the  duration  of  the  flight  of  the  packet  to  its 
destination.  This  means  that  we  must  either  assume  that 
everything  in  the  network  remains  constant,  except  for  node 
i's  decision,  or  that  node  i can  anticipate  all  the  changes 
in  the  delays,  which  will  be  experienced  by  the  packet,  due 
to  decisions  of  other  nodes.  Obviously  neither  assumption  is 
ever going to be satisfied. In the next section this strong
assumption will be relaxed with our choice to let the
information be not the actual values of the one-hop delays
but the expected values of the delays. Basing the minimum
delay decision rule on the latter information changes the
objective  from  minimizing  delay  to  minimizing  expected 
delay.  Nevertheless,  it  is  worthwhile  to  consider  first  the 
assumption  of  perfect  information  because  it  suggests  the 
simple  minimum  delay  decision  rule,  which  we  shall  assume  to 
be  in  effect  for  the  rest  of  our  analysis.  Our  primary 
concern  is  to  determine  the  information  on  which  to  base  the 
decisions. 


4:2  Choosing an Information Policy.

In  designing  the  information  policy,  there  is  a 

* sequence  of  decisions  that  must  be  made  about  the  nature  of 
the  information  to  be  collected,  and  the  structure  of  the 

i 

communication  policy  for  distributing  the  information.  We 
shall  describe  each  decision  in  turn,  indicating  which 
choice  we  make  in  the  design  presented  here. 

As pointed out in remark (4:1.8) the assumption of
perfect information is rather strong and cannot be satisfied
in practice because the delays at the nodes change much
faster than information can travel through the network.
(Note that this is fundamental to packet switched networks
since the control information flows on the same channel as
the data and should take only a fraction of the channel.
This is not the case in other applications such as vehicle
traffic.) A particular delay will here be modeled as a
constant plus the queueing delay in a queue with Poisson
arrivals of intensity $\lambda(t)$. The parameter of the Poisson
process, $\lambda(t)$, will be modeled as a jump process, where the
time between successive jumps is comparable with the delays
in the network. The parameters of the jump process will be
considered constant. On the basis of this cascade of
stochastic processes we can define different classes of
information policies. The choice among these classes
represents our first major design choice.

(4:2.1) Definition. An information policy in our context is
said to be of Class 0 if the information that the nodes
exchange is the actual delays, either the actual delay
itself or the queue length from which the corresponding
one-hop delay can be determined according to
(4:1.2)-(4:1.3). An information policy is of Class I if the
information exchanged is the parameters of the Poisson
processes or the corresponding expected delays. An
information policy is said to be of Class II if the
information is the parameters of the jump processes or the
corresponding long term expected delay.

(4:2.2)  Choice.  (Information  class.)  In  section  4:3  we  shall 
discuss  why  we  believe  Class  I to  be  the  best  choice  and  we 
continue  this  section  under  that  assumption. 

The  next  design  choice  to  make  is  how  to  model  the 
stochastic  process  at  a node  which  is  responsible  for  the 
delay.  As  indicated  in  remark  (4:1.4)  we  shall  pursue  in 
depth  the  case  in  which  variations  in  delay  are  caused  only 
by  variations  in  the  number  of  packets  in  the  corresponding 
queue.  Extensions  to  the  case  of  variable  packet  length  as 
given  in  remark  (4:2.5)  and  to  the  case  of  variable 
probability  of  retransmission,  p,  retransmission  time,  r, 
and  other  terms  contributing  to  the  delay  as  given  in  remark 
(4:2.15) are straightforward and do not alter the
information policy significantly. This justifies choosing a
queueing  model  for  the  delay.  This  model  has  the  additional 
advantage  that  it  has  as  a natural  description  the  Poisson 
model,  which  requires  estimating  only  a single  parameter, 
from  which  delay  can  be  predicted. 

(4:2.3) Choice. (Queueing model.) At each node there are
several queues - one for each outgoing link and one input
queue (cf. remark (4:1.4)). These queues can be modeled
individually as a set of independent queues or as a much
more complex system of dependent queues. We choose the
former approach for simplicity. The arrivals in an
individual queue are assumed to form a Poisson process with
intensity $\lambda(t)$. This choice is made because: a) packets
arriving in the queue come from a (possibly large) number of
independent sources, b) the collection of arrival patterns
that can be modeled as a Poisson process with time varying
parameter is very rich. The assumption of fixed packet
length implies a deterministic service time for each packet,
say $\beta$ seconds (i.e., we measure packet length in seconds
needed to go on the line, rather than bits). The model we
have now is that of an M/D/1 (Poisson arrivals/ deterministic
service time/ 1 server) queue. Let $Q(t)$ denote the queueing
delay in seconds at time $t$. For $\lambda(t)$ constant the steady
state expected queueing delay is given by

(4:2.4)  $\lim_{t\to\infty} E\{Q(t) \mid \lambda(t) = \lambda\} = \dfrac{\lambda\beta^2}{2(1-\lambda\beta)}$  for $\lambda\beta < 1$

which is a special case of the next formula (4:2.6).
Equation  (4:2.4)  enables  us  to  translate  the  parameter  of 
the  Poisson  process  into  expected  delay,  which  gives  us  an 
obvious  way  of  applying  the  minimum  delay  decision  rule  on 
Class  I information.  Note  that  since  (4:2.4)  gives  the 
steady  state  value  it  is  "predictive",  which  may  compensate 
for  information  delays. 


(4:2.5) Remark. Suppose that the packet length is variable
and takes the value $l_k$ with probability $p_k$ ($k = 1, 2, \ldots, K$),
independent of any previous packet length, where length is
measured in time needed to put a packet on an outgoing line.
Let $\lambda(t)$ again be the intensity of the Poisson process of
arrivals. Then the expected steady state queueing delay is
given by:

(4:2.6)  $\lim_{t\to\infty} E\{Q(t) \mid \lambda(t) = \lambda\} = \dfrac{\lambda\beta_2}{2(1-\lambda\beta)}$  for $\lambda\beta < 1$

where

  $\beta = \sum_{k=1}^{K} p_k l_k$  (average packet length)

and

  $\beta_2 = \sum_{k=1}^{K} p_k l_k^2$  (second moment).

This is a standard result for the M/G/1 queue (Poisson
arrivals/ General distribution of service time/ 1 server)
and can be found in e.g. [21], Ch.4.


(4:2.7) Remark. It is interesting to compare (4:2.4) and
(4:2.6) with the expected steady state delay in a queue with
Poisson arrivals and negative exponential service time. This
is the M/M/1 queue, which is used by most authors who apply
queueing theory to computer networks. For an M/M/1 queue we
have:

(4:2.8)  $\lim_{t\to\infty} E\{Q(t) \mid \lambda(t) = \lambda\} = \dfrac{2\lambda\beta^2}{2(1-\lambda\beta)}$  for $\lambda\beta < 1$

Formulas (4:2.4), (4:2.6) and (4:2.8) give successively more
conservative figures for the expected queueing delay which
can be seen from

  $\lambda\beta^2 \le 2\lambda\beta^2$  for $0 \le \lambda\beta < 1$
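The three queueing-delay formulas can be compared numerically.
The following is a minimal sketch, not part of the original
report: the function names are hypothetical, the M/M/1 numerator
is taken as $2\lambda\beta^2$ (the second moment of an exponential
service time with mean $\beta$), and the sample numbers assume
1000-bit packets on a 50K bits/sec line carrying 25 packets per
second.

```python
def md1_delay(lam, beta):
    """Expected queueing delay for fixed packet length beta (sec), as in (4:2.4)."""
    if not 0 <= lam * beta < 1:
        raise ValueError("need 0 <= lam * beta < 1")
    return lam * beta**2 / (2 * (1 - lam * beta))


def mg1_delay(lam, lengths, probs):
    """Expected queueing delay for variable packet length, as in (4:2.6)."""
    beta = sum(p * l for p, l in zip(probs, lengths))       # average packet length
    beta2 = sum(p * l * l for p, l in zip(probs, lengths))  # second moment
    if not 0 <= lam * beta < 1:
        raise ValueError("need 0 <= lam * beta < 1")
    return lam * beta2 / (2 * (1 - lam * beta))


def mm1_delay(lam, beta):
    """Expected queueing delay for exponential service time with mean beta."""
    if not 0 <= lam * beta < 1:
        raise ValueError("need 0 <= lam * beta < 1")
    return 2 * lam * beta**2 / (2 * (1 - lam * beta))


# Assumed sample values: 25 packets/sec, 0.02 sec per packet (load 0.5).
lam, beta = 25.0, 0.02
fixed = md1_delay(lam, beta)
mixed = mg1_delay(lam, [0.01, 0.03], [0.5, 0.5])  # same mean, two lengths
expo = mm1_delay(lam, beta)
```

With a single packet length the M/G/1 expression reduces to the
M/D/1 one, which is the special case mentioned after (4:2.4).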

The  next  step  in  the  cascade  of  stochastic 
processes,  which  generates  the  sequence  of  Class  0,  Class  I, 
...  information  policies,  is  to  model  the  time  varying 
parameter  of  the  Poisson  process  as  a stochastic  process. 

(4:2.9) Choice. (Model for the parameter of the Poisson
process.) The parameter of the Poisson process is assumed
to be a jump process which can take only a finite number of
values in the state space $\{\lambda_1, \lambda_2, \ldots, \lambda_L\}$. Jumps within the
state space occur only at equidistant points in time and
form a Markov chain with transition probabilities given by
the stochastic matrix $\Pi$. We choose this model for the
following reasons: a) only a finite number of states are
needed because we cannot distinguish between $\lambda$ and $\lambda+\epsilon$ with
statistical confidence on the basis of a limited number of
observations for a whole range of $\epsilon$'s, b) the assumption of
changes in the state only at equidistant points in time is
made to simplify the estimation of the state of the jump
process; this limitation can be made arbitrarily small by
choosing the distance between the time-points sufficiently
small, c) the Markov chain assumption is made because of
convenience; it corresponds to the assumption that the
probability of jumping to a given next state is a function
only of the present state.

(4:2.10) Remark. Choice (4:2.9) still leaves two parameters
for which a value has to be chosen, namely the number of
elements in the state space of the jump process, $L$, and the
time interval between possible jumps, $h$. The choice of $L$
and $h$ is a tradeoff between high precision of the model
(large $L$, small $h$) on the one hand and low line bandwidth
and node bandwidth requirements (small $L$, large $h$) on the
other hand.

The  final  decision  in  the  design  process  of  the 
random  variable  that  represents  delay,  is  how  to  model  the 
parameters  of  the  jump  process. 

(4:2.11) Choice. (Model for the parameters of the jump
process.) The matrix $\Pi$ of transition probabilities of the
jump process is considered constant. That is, it changes so
slowly that some centralized off-line procedure may be
implemented to adjust $\Pi$. The motivation for this choice is
that it probably takes a couple of hours of data collection
before one can decide with statistical confidence that the
parameters of the jump process are not what they were
thought to be. In appendix 4:A a simple numerical example is
considered to support this statement.


(4:2.12) Assertion. (Estimating the state of the jump
process.) Let the vector $p(k^-)$ give the probability
distribution over the states of the jump process
$\{\lambda_1, \ldots, \lambda_L\}$ just before a possible jump at $t_k$ takes place.
The matrix of (constant) transition probabilities $\Pi$ has as
its $(l,m)$th element the probability of jumping to $\lambda_l$ given
that the present state is $\lambda_m$. Then by elementary probability
theory the probability distribution over $\{\lambda_1, \ldots, \lambda_L\}$ just
after $t_k$ is given by $p(k^+) = \Pi\, p(k^-)$. Suppose that the
number of arrivals in the queue in the time interval $t_k$ to
$t_{k+1} = t_k + h$ is $n$, then by Bayes' Theorem the probability of
being in state $\lambda_l$ just before $t = t_{k+1}$, given the $n$ arrivals,
is:

  $p_l((k+1)^-) = \Pr\{\text{state at } t_{k+1}^- = \lambda_l \mid n \text{ arrivals}\} = \dfrac{\Pr\{n \text{ arrivals} \mid \lambda_l\}\, p_l(k^+)}{\Pr\{n \text{ arrivals}\}}$

where

  $\Pr\{n \text{ arrivals} \mid \lambda_m\} = \dfrac{(\lambda_m h)^n e^{-\lambda_m h}}{n!}$

and

  $\Pr\{n \text{ arrivals}\} = \sum_{m=1}^{L} \Pr\{n \text{ arrivals} \mid \lambda_m\}\, p_m(k^+)$

Let $\hat{l}$ be such that $p_{\hat{l}}((k+1)^-) \ge p_l((k+1)^-)$ for $l = 1, \ldots, L$, then the
a-posteriori most likely estimate of the state of the jump
process just before $t = t_{k+1}$ is $\lambda_{\hat{l}}$.
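The estimation step of (4:2.12) can be sketched in a few lines.
This is an illustration under stated assumptions rather than the
procedure used in the report: the function names, the two-state
example and all numerical values are hypothetical, and the matrix
is stored so that `Pi[l][m]` is the probability of jumping to
state `l` from state `m` (columns sum to one), as in the
assertion.

```python
import math


def predict(Pi, p_before):
    """p(k+) = Pi p(k-): distribution just after a possible jump."""
    L = len(p_before)
    return [sum(Pi[l][m] * p_before[m] for m in range(L)) for l in range(L)]


def poisson_pmf(n, lam, h):
    """Probability of n arrivals in an interval of length h at intensity lam."""
    return (lam * h) ** n * math.exp(-lam * h) / math.factorial(n)


def update(p_after, lambdas, n, h):
    """Bayes' theorem: distribution just before the next jump, given n arrivals."""
    weights = [poisson_pmf(n, lam, h) * p for lam, p in zip(lambdas, p_after)]
    total = sum(weights)  # Pr{n arrivals}
    return [w / total for w in weights]


def most_likely_state(p):
    """Index of the a-posteriori most likely state."""
    return max(range(len(p)), key=p.__getitem__)


# Hypothetical example: two intensities (packets/sec), h = 4 sec.
lambdas = [5.0, 20.0]
Pi = [[0.9, 0.2],
      [0.1, 0.8]]
p_plus = predict(Pi, [0.5, 0.5])
p_next = update(p_plus, lambdas, n=70, h=4.0)
estimate = lambdas[most_likely_state(p_next)]
```

Seventy arrivals in four seconds are far more probable at 20
packets per second than at 5, so the estimate selects the higher
intensity even though the prior slightly favours the lower one.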


Up to this point we have modeled three classes of
parameters that describe the state of the network; from
Class 0 that gives the instantaneous state and changes very
rapidly, to Class II that gives long term statistics about
the state of the network and can be considered constant. We
also decided that the information exchanged between nodes
should be of Class I, for reasons to be explained in section
4:3. Still more choices have to be made, namely within Class
I we must decide what sort of data to send between the
nodes, whether or not to perform the computation of minimum
delays in a distributed way (this affects the data that is
exchanged between nodes) and when to send the data.

(4:2.13) Choice. (Sending Poisson parameter vs. expected
delay values.) First we must decide what data to send. Two
possibilities are the Poisson parameter, from which the
expected queueing delay can be derived, or the actual expected
value of the delay itself. The former has the advantage that
it is a very concise representation, and transmitting it is
economical of line bandwidth. On the other hand, it may
require a significant amount of node processing and node
storage to convert this parameter into expected delay. The
node receiving the parameter $\lambda_{ij}$ must perform the
following calculation to obtain the expected delay:

(4:2.14)  $d_{ij} = k_{ij} + \dfrac{\lambda_{ij}\beta^2}{2(1-\lambda_{ij}\beta)}\, k'_{ij}$

where $k_{ij}$ and $k'_{ij}$ are "constants" that depend on $t_{ij}$, $c_{ij}$,
$p_{ij}$, $r_{ij}$ (cf. (4:1.5)). Under our assumptions of constant $\beta$,
$t_{ij}$, $c_{ij}$, $p_{ij}$, $r_{ij}$, all the nodes must duplicate the
performance of the calculation (4:2.14) and have tables of
$k_{ij}$ and $k'_{ij}$. If the values of $t_{ij}$, $c_{ij}$, $p_{ij}$, $r_{ij}$ are
changing with time, additional communication is needed to
update $k_{ij}$ and $k'_{ij}$. Therefore, we choose to design the
information policy to send the actual value of the expected
delay rather than the Poisson parameter. We note in passing
that it may still be possible to use a very compact
representation of the delay in order to minimize the use of
line bandwidth, at the expense of increased node bandwidth
requirements. For instance, a floating point format or other
simplified version of the same concept can be used since
there are relatively few significant bits of information
about the value of the expected delay, whereas there may be
a large range of values.

(4:2.15) Remark. The choice of sending expected delay values
has the additional advantage that it is now simple to
accommodate the case where $t_{ij}$, $c_{ij}$, $p_{ij}$, $r_{ij}$ are changing
in time if we assume that they change at a rate slower than
or equal to the rate at which the Poisson parameter is
changing. In addition to the estimation of $\lambda_{ij}$ according to
(4:2.13) node $i$ can perform (probably less sophisticated)
procedures for estimating $t_{ij}$, $c_{ij}$, $p_{ij}$, $r_{ij}$ and use those
values for computing $d_{ij}$. And of course variable packet
length can also be included in the computation of $d_{ij}$ (cf.
remark (4:2.5)).

(4:2.16) Choice. (Replicated vs. distributed computation of
minimum delays.) An important design decision is whether
all the nodes perform independent, replicated computations
with the same input data, or whether they cooperate in
distributing the computation among the nodes, and share the
output data. In the replicated computation, after all the
nodes have exchanged their values of the expected delay $d_{ij}$,
each  node  performs  its  computation  of  the  matrix  of  minimum 
expected  delays  D,  as  indicated  in  definition  (4:1.6)  and 
from  that  finds  the  neighbor  which  lies  on  the  quickest 
path.  Some  simple  calculations  show  that  this  approach 
requires  unrealistically  high  node  bandwidth.  Each  node 
receiving  a new  one-hop  distance  must  perform  a new  quickest 
path  computation  which  is  quadratic  in  the  number  of  nodes, 
and  each  node  receives  such  updates  from  all  other  nodes, 
which  makes  the  process  cubic.  The  computation  of  delay 
vectors  from  one-hop  delay  in  a replicative  fashion  is 
therefore  discarded  in  favor  of  a distributed  approach. 

We now present one way of distributing the computation of
the matrix $D$ among the nodes, which we shall use in our
comparative analysis in section 4:3. Let node $i$ compute row
$i$ of $D^p = D^1 D^{p-1}$ and send the result to its neighbors; this
constitutes one iteration. Note that $D^1$ is a sparse matrix:
row $i$ has finite elements only in columns $j_1, j_2, \ldots, j_n$ and
$i$, where $j_1, j_2, \ldots, j_n$ are the neighbors of $i$ (typically $n$ is
5 or less). This means that in computing row $i$ of $D^p$ node $i$
needs to have only rows $j_1, j_2, \ldots, j_n$ of $D^{p-1}$. This is
exactly what node $i$ received from its neighbors in the
previous iteration. The total number of iterations needed is
equal to the maximum hop distance $M$ in the network. However,
partial results in the computation of $D$ (these are $D^p$, $p<M$)
can often be used for routing decisions. For example, after
only three iterations, node $i$ has available to it the
minimum 3-hop delays from its neighbors to all destinations
(some of the values may be "infinite", meaning that the
destination is not reachable in 3 hops from the neighbor).
Thus, the node can decide for all destinations that are
reachable in 4 hops or less (these may account for a large
portion of the traffic) via which neighbor the minimum 4-hop
or less delay is attained. This will lead to a
sub-optimality only in the case that there is a path of more
than 4 hops to some node which has lower delay than the
4-hop or less path. It is assumed that the nodes determine
their estimates of expected 1-hop delay to their neighbors
just before a new sequence of $M$ iterations is started and do
not alter this value until the next sequence of $M$
iterations. This is to help prevent the situation in which
node $i$ thinks that the quickest path to destination $k$ is via
node $j$, and node $j$ thinks that the quickest path to
destination $k$ is via node $i$.

With this model of the distributed computation we have also
determined the form of the data that is exchanged between
the nodes: row-vectors of minimum p-hop expected delays of
one node to all destinations.
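The distributed iteration can be illustrated with a small
simulation. This is a sketch under assumptions, not the report's
implementation: definition (4:1.6) is not reproduced here, so the
step from the $(p-1)$-hop to the $p$-hop rows is assumed to be the
usual "minimum over neighbors of one-hop delay plus the neighbor's
delay"; the function names and the four-node example network are
hypothetical.

```python
INF = float("inf")


def first_rows(one_hop):
    """Rows of the one-hop matrix: finite only for the node itself and its neighbors."""
    nodes = list(one_hop)
    return {i: {k: 0.0 if k == i else one_hop[i].get(k, INF) for k in nodes}
            for i in nodes}


def iterate(one_hop, rows):
    """One iteration: node i builds its new row from its own and its neighbors' rows."""
    nodes = list(one_hop)
    new_rows = {}
    for i in nodes:
        row = {}
        for k in nodes:
            best = rows[i][k]  # keep the value already known to node i
            for j, d_ij in one_hop[i].items():
                # go to neighbor j first, then follow j's best known path to k
                best = min(best, d_ij + rows[j][k])
            row[k] = best
        new_rows[i] = row
    return new_rows


def min_delays(one_hop, p):
    """Minimum delays over paths of at most p hops."""
    rows = first_rows(one_hop)
    for _ in range(p - 1):
        rows = iterate(one_hop, rows)
    return rows


# Hypothetical network: chain A-B-C-D plus a slow direct link A-C.
one_hop = {
    "A": {"B": 1.0, "C": 5.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"A": 5.0, "B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
d1 = min_delays(one_hop, 1)
d2 = min_delays(one_hop, 2)
d3 = min_delays(one_hop, 3)
```

In this example D is unreachable from A in one hop, reachable in
two hops over the slow link, and the true minimum appears only
once three-hop paths are included, which is the kind of partial
result the text proposes to use for routing before all $M$
iterations have completed.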

(4:2.17) Remark. The ARPA Network scheme of distributed
computation differs from the scheme given in (4:2.16). In
the ARPA Network scheme, node $i$ sends to its neighbors a
row-vector of minimum delays from $i$ to all destinations and
gets the same information from its neighbors. Node $i$ then
recomputes its new row vector of minimum delay from its
present values of one-hop delays to its neighbors and the
information just received. This scheme also converges to
the actual minimum delays when all the one-hop delays remain
constant, but in an undetermined number of iterations. In
practice, of course, the nodes must update their estimates
of the one-hop delays from time to time, sometimes within a
sequence of update iterations. This can in some cases lead
to the ping-ponging alluded to in (4:2.16), where node $i$
sends traffic for $k$ via $j$ and $j$ sends traffic for $k$ via $i$.
There are ad hoc solutions to prevent this ping-ponging but
then it is no longer simple to analyse how up-to-date the
routing information is. We have therefore chosen the scheme
in (4:2.16) for further analysis in section 4:3.

The  final  choice  we  have  to  make  determines  the 
frequency  with  which  the  nodes  exchange  control  data. 

(4:2.18) Choice. (When to exchange data.) A sequence of
iterations as described in (4:2.16) can be started
periodically  without  regard  to  changes  in  the  one-hop  delays 
or  solely  in  response  to  such  changes,  or  a combination  of 
both.  The  last  approach  has  the  merits  of  both;  when 
information  changes,  the  local  node  can  start  a new  sequence 
of  iterations.  Some  aspects  of  periodic  updating  must  be 
maintained  to  ensure  that  the  process  does  not  occur  with 
too  low  a frequency  (bad  for  reliability)  or  too  high  a 
frequency  (bad  for  line  bandwidth).  In  fact,  in  a large 
network,  it  is  not  possible  to  have  the  ideal  of  event 
triggered  routing  information  flooding  the  network,  since 
too  many  new  events  occur  and  the  paths  are  so  long.  Then 
this  approach  degenerates  to  periodic  updating. 

(4:2.19) Summary. In this section we have presented a set of
design choices leading to a fully specified routing
algorithm of Class I. As shown in figure 4:2.a these choices
can be seen as defining a particular algorithm which can
then be analysed and compared against other methods. For the
purposes of a comparative analysis in section 4:3, we define
an analogous set of decisions for Class 0 and Class II, as
illustrated in the figure. It is these three algorithms, the
results of the choices shown, which are analysed in section 4:3.


[Figure 4:2.a. Class 0, I, II.]

4:3 Comparison of Classes of Information Policies.

In  this  section  we  shall  compare  the  performance 
of  the  minimum  delay  decision  rule  when  it  is  either  based 
on  the  actual  (but  possibly  delayed)  value  of  the  delay 
(Class  0)  or  on  the  expected  (possibly  delayed)  value  of  the 
delay  (Class  I)  or  on  the  long  term  average  of  the  expected 
delay  (Class  II)  in  order  to  support  our  general  contention 
that  Class  I is  superior  to  Class  0 and  Class  II. 

We  shall  compare  the  different  classes  by  considering  for  a 
typical  packet  that  has  to  travel  from  (source)  node  i to 
(destination)  node  k,  the  relative  magnitude  of  the  expected 
delay  in  each  of  the  three  cases.  After  that,  we  shall 
include  the  additional  delay  caused  by  control  information 
exchanged  between  the  nodes,  according  to  the  choices  made 
at the end of the previous section. Finally, graphs will be
presented that give delay vs. time between updates of the
node  delay  tables  for  each  of  the  three  information 
policies.  To  compute  the  graphs  we  used  expressions  given 
below  with  values  that  are  representative  for  networks  like 
the  ARPA  Network. 

(4:3.1) Expressions. In the case of Class II information,
the long term expected delay for a typical packet that can
travel from node $i$ to node $k$ along alternative routes
$1, \ldots, R$ is

(4:3.2)  $E_{II} = \min_{r=1,\ldots,R} \{\text{long term expected delay along route } r\}$

For Class I and Class 0 we have the following expressions

(4:3.3)  $E_I = P\,W_I + (1-P)\,\tilde{W}_I$

(4:3.4)  $E_0 = P\,W_0 + (1-P)\,\tilde{W}_0$

$P$ is the probability that the expected values of the delays
along the alternative routes have not changed in the time
interval between collecting data for the routing decision
and transmitting the packet. In other words, $P$ is the
probability that the traffic pattern has not changed between
the computation of the node delay table and the time that
the packet travels from node $i$ to node $k$. $W_I$ and $W_0$ are the
long term expected delays under Class I and Class 0
information given that the traffic pattern has not changed.
$\tilde{W}_I$ and $\tilde{W}_0$ are the corresponding quantities, given that the
traffic pattern has changed. The quantities $W_I$, $\tilde{W}_I$, $W_0$, $\tilde{W}_0$
depend on the probability distributions of the expected
delay along the different routes and the (conditional)
probability distributions of the actual delay given the
expected delay. In Appendix 4:B expressions are derived for
those quantities in terms of the probability densities.

We still need to assess the probability $P$ that the values of
the expected delays are still the same at the time of
transmission of the packet. Let $T$ be the average time
between changes of the values of the expected delays, and
suppose the time between changes has a negative exponential
distribution. Further let $U$ be the time between the starts
of updates (the start of an update is considered to coincide
with the time of collecting data), and let $V$ be the time
between the start of an update and the time the data are
available. The time elapsed since the last change, $t$, has
a negative exponential distribution with mean $T$. The
time since the last update that started longer than $V$ ago,
$u$, has a uniform distribution on $(V, U+V)$. The probability $P$
corresponds to the probability that $t > u$. For given $u$ this is

  $\Pr\{t > u \mid u\} = \int_u^\infty \frac{1}{T} e^{-t/T}\, dt$

Integrating over the uniform distribution gives

(4:3.5)  $P = \dfrac{1}{U} \int_V^{U+V} \int_u^\infty \frac{1}{T} e^{-t/T}\, dt\, du = \dfrac{T}{U} \left( e^{-V/T} - e^{-(U+V)/T} \right)$
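As a check on (4:3.5), the closed form can be compared with a
direct numerical average of $e^{-u/T}$ over the uniform
distribution of $u$. The sketch below is illustrative only; the
function names and the sample values of $U$, $V$ and $T$ are
assumptions.

```python
import math


def prob_unchanged(U, V, T):
    """Closed form: P = (T/U) * (exp(-V/T) - exp(-(U+V)/T))."""
    return (T / U) * (math.exp(-V / T) - math.exp(-(U + V) / T))


def prob_unchanged_numeric(U, V, T, steps=100_000):
    """Midpoint-rule average of Pr{t > u | u} = exp(-u/T) for u uniform on (V, U+V)."""
    du = U / steps
    total = sum(math.exp(-(V + (i + 0.5) * du) / T) for i in range(steps))
    return total * du / U


# Assumed sample values: update every 2 sec, data available after 0.8 sec,
# traffic pattern changes on average every 10 sec.
p_closed = prob_unchanged(U=2.0, V=0.8, T=10.0)
p_numeric = prob_unchanged_numeric(U=2.0, V=0.8, T=10.0)
```

The two values agree to many decimal places, and $P$ tends to
$e^{-V/T}$ as $U$ becomes small, i.e. frequent updating cannot
compensate for the delay $V$ before the data become available.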

(4:3.6) Remark. Our choice of information Class I can be
"proved" to be optimal if we can show that $E_I \le E_0$ and $E_I \le E_{II}$
for a typical large packet switched computer network (see
(4:3.9) for what is meant by "typical large"). One can see
from (4:3.2)-(4:3.4) that in order to prove optimality in
general it would be sufficient to show that a) $W_I \le E_{II}$, b)
$W_I \le W_0$, c) $\tilde{W}_I \le E_{II}$, and d) $\tilde{W}_I \le \tilde{W}_0$ if we do not consider the
additional delay due to control information. Let us admit
here that it is not possible to show that in general a), b),
c), and d) are true. It can be shown that a) and b) are true
in general but d) may not be true for some pathological
choices of the probability distribution for the expected
delay and for the actual delay given the expected delay.
Moreover, c) is in general not true. Therefore additional
conditions on the probability densities, and conditions on
the value of $P$ that ensure that what is gained in b) has more
weight than what is lost in d), are needed. Rather than
trying to find these precise conditions we will determine
typical values of $E_0$, $E_I$, $E_{II}$ that are adjusted to include
additional delay due to routing information and from which
we can also infer how large the differences in expected delay
are.

The final factor we have to take into account is the
increase in delay experienced by data packets due to
communication between nodes of delay values (control
information). In a Class II information scheme, the delay
values can be considered constant which means that there is
no increase because of control information. For Class I and
Class 0 information schemes, we assume that the delay values
are exchanged between nodes by means of the decentralized
computation of minimum delays according to (4:2.16).
Although (4:2.16) was formulated in terms of Class I
information, it also holds for Class 0 information merely by
reading actual delay for expected delay.

(4:3.7) Expression. (Increase in delay due to control
information.) Let $U$ be the time between successive series
of iterations according to (4:2.16). One iteration consists
of the comparison by a node of the minimum p-hop delay vectors
of its neighbors and the exchange of the newly computed
(p+1)-hop delay vectors. Each vector travels as a packet of
$v$ bits. Let $M$ be the total number of iterations per
computation of minimum delays for the whole network. Then the
average number of packets of control information per second
that travels on each link is $M/U$ packets per second, which
implies a capacity reduction on that link of $Mv/U$ bits/sec.
This implies an increase in queueing delay, which would
otherwise be $\dfrac{\lambda\beta^2}{2(1-\lambda\beta)}$. The service time $\beta$ is related to the
line capacity $C$ by $\beta = b/C$ where $b$ is the number of bits in a
data packet. A little bit of calculus with the three
expressions just mentioned gives for the relative increase of
queueing delay

(4:3.8)  $\dfrac{d+\Delta d}{d} = \dfrac{C^2 - C_d C}{C_r^2 - C_d C_r}$

where

  $C$ = line capacity
  $C_r = C - \dfrac{Mv}{U}$ = reduced line capacity
  $C_d = \lambda b$ = capacity needed for data packets.

Of course the factor should only multiply the portion of the
delay which is in excess of the fixed delay $W_{I,fix}$,
respectively $W_{0,fix}$, where the fixed delay is the delay
along the path when all queues are empty.

Now typical values must be chosen for the
parameters that appear in (4:3.2)-(4:3.5) and (4:3.8) before
we can draw representative graphs that give long term
expected delay versus frequency of updating, $1/U$, for each
of the three information policies.

(4:3.9) Values. (Typical values for a large network.) For
the long term expected delay we distinguish between
information class (II, I, 0), and whether the traffic
pattern (t.p.) has changed or not. Also for Class I and 0
that portion of the delay that is fixed (i.e. when all
queues are empty) is given.

  $E_{II}$ = 447 msec        : Class II
  $W_I$ = 350 msec           : Class I, t.p. not changed
  $\tilde{W}_I$ = 477 msec   : Class I, t.p. changed
  $W_{I,fix}$ = 128 msec     : Class I, fixed portion
  $W_0$ = 376 msec           : Class 0, t.p. not changed
  $\tilde{W}_0$ = 481 msec   : Class 0, t.p. changed
  $W_{0,fix}$ = 134 msec     : Class 0, fixed portion

The parameters above are determined by the choices of
probability densities, according to the formulas given in
appendix 4:B. The probability densities that were actually
used in the computation are given in figures
(4:B.a)-(4:B.c). They were inspired by graphs for the
distribution of delay of messages in the ARPA Network from
the Network Measurement Centre.


Further we have

  $T$ = 10 sec    : average time between changes in t.p.
  $V = \max\{\frac{H-1}{M} U,\ V_{min}\}$ : time between start of update and
                  availability of new routing data.
                  The expression for $V$ is explained below.
  $H$ = 5 hops    : number of hops in typical path
                  between source node and destination node.
  $M$ = 10 hops   : maximum hop distance in the network

The number of iterations in an update of the node delay
tables is $M$ but after $H-1$ iterations delay values to
destinations that are $H$ or less hops away may be used.
Assuming that iterations are equally spaced in the interval
between updates we find $V = \frac{H-1}{M} U$. However there is a minimum
amount of time that is required for one iteration, resulting in

  $V_{min}$ = .2 sec : time for 4 iterations

This conservative estimate (i.e. the maximum of the minimum
time for 4 iterations) is obtained as follows: The time for
one iteration is the maximum time it takes to get the delay
vector to the neighbor node; this is .02 sec waiting for the
current packet of 1000 bits going out @ 50K bits/sec, .022
sec for the delay vector of 600 bits @ 50K bits/sec + 3000
kilometer @ 300,000 km/sec (speed of light) plus .001 sec
processing time: 180 comparisons and additions ($n$ = 3
neighbors per node, $N$ = 60 nodes) @ 5 $\mu$sec. Hence $V_{min}$ =
4·(.02+.022+.001) ≈ .2 sec.

  $C$ = 50K bits/sec              : line capacity
  $C_r = (50 - \frac{6}{U})$K bits/sec : reduced line capacity because of
                                  routing information flowing
  $C_d$ = 25K bits/sec            : capacity needed for data packets
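To show how the pieces fit together, the following sketch
evaluates $P$ from (4:3.5), the relative increase from (4:3.8)
and the resulting long term expected delays for a given update
interval $U$, using the typical values listed in (4:3.9). It is
an illustration, not the program used for the figures: the
function names are hypothetical, and applying the factor (4:3.8)
only to the part of each delay above the fixed portion is our
reading of the sentence following (4:3.8).

```python
import math

T = 10.0                  # average time between traffic pattern changes (sec)
H, M = 5, 10              # typical path length, maximum hop distance
V_MIN = 0.2               # minimum time for H-1 iterations (sec)
C, C_D = 50_000.0, 25_000.0   # line capacity, data load (bits/sec)
V_BITS = 600              # bits per delay vector
E_II = 447.0              # Class II delay (msec), no control traffic
# (t.p. unchanged, t.p. changed, fixed portion) in msec
W = {"I": (350.0, 477.0, 128.0), "0": (376.0, 481.0, 134.0)}


def prob_unchanged(U):
    """P from (4:3.5) with V = max((H-1)/M * U, V_min)."""
    V = max((H - 1) / M * U, V_MIN)
    return (T / U) * (math.exp(-V / T) - math.exp(-(U + V) / T))


def relative_increase(U):
    """Factor (4:3.8) with C_r = C - M*v/U."""
    c_r = C - M * V_BITS / U
    if c_r <= C_D:
        return math.inf   # control traffic leaves too little capacity for data
    return (C**2 - C_D * C) / (c_r**2 - C_D * c_r)


def expected_delay(info_class, U):
    """E_I or E_0 (msec), queueing part scaled by the relative increase."""
    w, w_changed, w_fix = W[info_class]
    f = relative_increase(U)
    P = prob_unchanged(U)
    return P * (w_fix + f * (w - w_fix)) + (1 - P) * (w_fix + f * (w_changed - w_fix))


e_i = expected_delay("I", 2.0)
e_0 = expected_delay("0", 2.0)
```

Evaluating `expected_delay` over a range of $U$ gives curves of
the kind summarized in the last figure: very frequent updates
inflate the queueing delay through `relative_increase`, while
infrequent updates lower $P$.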

With the given values we can compute the graphs
below, of which the last one summarizes the long term
expected delay as a function of the frequency of updating,
for each of the three information policies.


Figure 4:3.a. Delay vs. P. P is the probability that the
traffic pattern has not changed since the last update. The
long term expected delay for a typical packet here does not
include delay due to routing information.

[Figure axis label: 1/U (updates per sec).]

Figure 4:3.c. Relative increase in delay vs. frequency of
updating. The relative increase in delay $(d+\Delta d)/d$ gives the
factor by which the queueing delays increase due to routing
information flowing through the network.



(4:3.10) Remark. Figure 4:3.d gives a slightly conservative
view of the differences in long term expected delay under
Class II, Class I and Class 0 information. The graphs were
calculated under the assumption that the probability
distributions of the delays were the same, whether the
information policy is Class II, I or 0. However, we may now
conclude that using Class 0 information instead of Class I
information would increase the probability of having a
larger delay which would make the differences between $W_I$
and $W_0$ and between $\tilde{W}_I$ and $\tilde{W}_0$ larger. Including this effect
would enhance the difference between the graphs of Class I
and Class 0 information. In the same way the graph for Class II
information would lie at a higher point in figure 4:3.d if
the effect of worse probability distributions of the delays
were included.

4:A Appendix. Parameters of the Jump Process

The time between jumps of the parameter of the
Poisson process is typically a few seconds, say 4 seconds.
Suppose the estimated parameters of the jump process $\Pi_{est}$
are close to the true parameters $\Pi_{true}$, say

$$\sum_{j=1}^{L} \left| \Pi_{est}(i,j) - \Pi_{true}(i,j) \right| = \text{[illegible]}$$

where L is the

number of states of the jump process. Then the amount of
data that has to be collected to be able to reject the
hypothesis $\Pi = \Pi_{est}$ in favor of the hypothesis $\Pi = \Pi_{true}$ with a
.05 probability of rejecting inappropriately is typically
about 300 or more per state of the jump process. If we
assume the number of states L=5, then this amounts to
collecting data at least for 5x300x4 = 6,000 sec ≈ 1.7 hours,
which makes adjusting the values of $\Pi$ well suited for a
centralized off line procedure.
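The estimate of the observation time is a simple product, which the following sketch reproduces. The names are illustrative, and the sketch assumes, as the text does implicitly, that one usable observation is obtained per jump.

```python
# Figures quoted in the text; the names are illustrative only.
SECONDS_BETWEEN_JUMPS = 4       # typical time between jumps
OBSERVATIONS_PER_STATE = 300    # needed for the .05 rejection level
NUMBER_OF_STATES = 5            # L


def collection_time_sec(states, per_state, seconds_between_jumps):
    """Minimum observation time if every jump yields one usable observation."""
    return states * per_state * seconds_between_jumps


total_sec = collection_time_sec(
    NUMBER_OF_STATES, OBSERVATIONS_PER_STATE, SECONDS_BETWEEN_JUMPS
)
total_hours = total_sec / 3600
print(total_sec, "sec =", round(total_hours, 2), "hours")
```

An observation period of this length is long compared with any time scale of the routing decisions themselves, which is the reason for treating the estimation as an off line task.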


4:B Appendix. Derivation of the Long Term Expected Delay
with Class 0, I and II Information.

A typical  packet  that  has  to  travel  from  node  i to 
node  k can  do  so  along  alternative  routes  1,...,R.  Which 
alternative  is  chosen  depends  on  the  class  of  information 
policy  and  on  the  actual  values  of  the  information.  As  a 
direct  consequence  of  the  choices  of  models  made  in  section 
4:2,  we  can  model  the  delay  along  each  of  the  alternative 
routes in the following way. The expected delay $\bar d_r$
along route r is a sample of the probability density $g_r(\cdot)$.
The actual delay $d_r$ along route r given the expected delay
$\bar d_r = t$ is a sample of the probability density $f_r(\cdot|t)$. From
this it follows that the unconditional delay along route r
is a sample from the probability density $\bar f_r(\cdot)$, where $\bar f_r(\cdot)$
is defined by

$$\bar f_r(s) = \int_0^\infty f_r(s|t)\, g_r(t)\, dt, \qquad r = 1,\dots,R$$

The  long  term  expected  delay  along  route  r is  then  given  by 

$$\bar{\bar d}_r = \int_0^\infty s\, \bar f_r(s)\, ds, \qquad r = 1,\dots,R$$
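To make the two integrals above concrete, the following sketch evaluates them numerically for one route. The densities are assumptions chosen for illustration only, not the ones plotted in the figures: the expected delay is taken uniform on [.1, .3] sec, and the actual delay given expected delay t is taken exponential with mean t. The function names and grid sizes are likewise ours.

```python
import math


def g(t, lo=0.1, hi=0.3):
    """Assumed density of the expected delay: uniform on [lo, hi] sec."""
    return 1.0 / (hi - lo) if lo <= t <= hi else 0.0


def f(s, t):
    """Assumed density of the actual delay given expected delay t:
    exponential with mean t."""
    return math.exp(-s / t) / t if s >= 0 else 0.0


def midpoints(lo, hi, n):
    """Midpoints and step width of n equal subintervals of [lo, hi]."""
    h = (hi - lo) / n
    return [lo + (k + 0.5) * h for k in range(n)], h


T_GRID, DT = midpoints(0.1, 0.3, 100)     # support of g
S_GRID, DS = midpoints(0.0, 6.0, 3000)    # tail beyond 6 sec is negligible here


def f_bar(s):
    """Unconditional density: integral of f(s|t) g(t) dt by the midpoint rule."""
    return sum(f(s, t) * g(t) for t in T_GRID) * DT


def long_term_expected_delay():
    """Integral of s * f_bar(s) ds by the midpoint rule."""
    return sum(s * f_bar(s) for s in S_GRID) * DS


print(round(long_term_expected_delay(), 4))
```

Because the assumed conditional density has mean t, the long term expected delay should come out close to the mean of the assumed g, i.e. .2 sec.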

In figures 4:B.a-4:B.c, typical probability densities $g_r(\cdot)$,
$f_r(\cdot|t)$ for some sample values t and $\bar f_r(\cdot)$ are plotted. They
represent the probability densities that were used in
computing the values (4:3.9) on which the graphs at the end
of section 4:3 were based.

Figure 4:B.a. Probability densities of expected delay.
A set  of  independent  samples  from  these  densities 
constitutes  a traffic  pattern.  According  to  our  model  the 
time  between  taking  successive  sets  of  samples,  i.e.  the 
time  between  changes  of  traffic  pattern,  has  a negative 
exponential distribution with mean T.

We assume that the routes are numbered in such a way that
$\bar{\bar d}_1 \le \bar{\bar d}_2 \le \dots \le \bar{\bar d}_R$. At this point we can already give the
expression for the long term expected delay when the minimum
delay decision rule is based on Class II information (i.e.
fixed routing). Namely, under this scheme, route 1 is chosen
invariably, which gives

(4:B.1)   $$W_{II} = \bar{\bar d}_1$$

To  get  expressions  for  the  long  term  expected  delay  under 
Class  I and  Class  0 information,  we  have  to  distinguish  two 
cases.  In  the  first  case,  the  traffic  pattern  has  not 
changed  in  the  time  interval  between  collecting  data  and 
transmitting  the  packet,  which  means  that  the  actual  delays 
are samples from $f_r(\cdot|t_r)$, r=1,...,R, where at the time the
data for the routing decision was collected we had $\bar d_r = t_r$,
r=1,...,R. In the second case, the traffic pattern has
changed, which means that the actual delays are samples from
$f_r(\cdot|t'_r)$, r=1,...,R, where $t'_1, t'_2, \dots, t'_R$ are independent
(from the values of $\bar d_r$ at the time data was collected)
samples from $g_r(\cdot)$, r=1,...,R, or equivalently the actual
delays are samples from $\bar f_r(\cdot)$.

Let $r_I$ denote the route chosen by applying the minimum delay
decision rule to Class I information:

(4:B.2)   $$r_I = \arg\min_{r=1,\dots,R} \{ t_r \}$$

where $t_r$ is the expected delay of route r at the time that
data is collected. Let $r_0$ be the route chosen with Class 0
information, i.e.,

(4:B.3)   $$r_0 = \arg\min_{r=1,\dots,R} \{ s_r \}$$

where $s_r$ is a sample from $f_r(\cdot|t_r)$, r=1,...,R. This sample
is  assumed  to  be  always  independent  of  the  actual  delay  the 
packet  experiences  going  from  i to  k,  representing  the  fact 
that  the  actual  delay  is  very  rapidly  changing  compared  to 
the  delay  in  information. 

In the first case (when expected delays have not changed)
with Class I information, the long term expected delay given
$r_I = r$ weighted with the probability that $r_I = r$ is

(4:B.4)   $$\int_0^\infty t\, g_r(t) \prod_{p \ne r} \bigl(1 - G_p(t)\bigr)\, dt$$

where $G_p(\cdot)$ is the distribution function corresponding to
$g_p(\cdot)$. This result (and also later results of the same
character) is easily understood by noting that $g_r(t)\,dt$
represents the probability that a sample of $g_r(\cdot)$ has the
value t and that $(1 - G_p(t))$ is the probability that a sample
from $g_p(\cdot)$ has a value >t. Summing (4:B.4) over all
possible  values  of  r,  we  get  the  long  term  expected  delay 
under  Class  I information,  given  that  the  traffic  pattern 
has  not  changed. 

(4:B.5)   $$W_I = \sum_{r=1}^{R} \int_0^\infty t\, g_r(t) \prod_{p \ne r} \bigl(1 - G_p(t)\bigr)\, dt$$
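The sum of integrals in (4:B.5) is the expected value of the smallest of the R expected delays, which gives a convenient check. The sketch below evaluates the expression numerically under assumptions that are ours and purely illustrative: three routes whose expected delays are exponentially distributed with means .20, .25 and .40 sec. For exponential densities the minimum has mean 1/(sum of the rates), so the numerical result can be compared with a closed form.

```python
import math

# Assumed means of the expected delay on each route, in sec (illustrative).
MEAN_EXPECTED_DELAY = [0.20, 0.25, 0.40]
RATES = [1.0 / m for m in MEAN_EXPECTED_DELAY]


def g(r, t):
    """Assumed density of the expected delay on route r (exponential)."""
    return RATES[r] * math.exp(-RATES[r] * t)


def G(r, t):
    """Distribution function belonging to g(r, .)."""
    return 1.0 - math.exp(-RATES[r] * t)


def w_class_1(upper=5.0, steps=5000):
    """Midpoint-rule evaluation of the sum of integrals in (4:B.5)."""
    h = upper / steps
    routes = range(len(RATES))
    total = 0.0
    for r in routes:
        for k in range(steps):
            t = (k + 0.5) * h
            others = math.prod(1.0 - G(p, t) for p in routes if p != r)
            total += t * g(r, t) * others * h
    return total


W_I = w_class_1()
CLOSED_FORM = 1.0 / sum(RATES)      # mean of the minimum of exponentials
# Fixed routing (Class II), assuming in addition that f_r(.|t) has mean t,
# so that the long term expected delay of a route equals the mean of g_r.
W_II = min(MEAN_EXPECTED_DELAY)
print(round(W_I, 4), round(CLOSED_FORM, 4), W_II)
```

Under these assumptions up-to-date Class I information gives a long term expected delay well below that of fixed routing, which always uses route 1.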

When expected delays have not changed and we have Class 0
information, the probability that $r_0 = r$ given that the
expected delay along route r is t is

(4:B.6)   $$\int_0^\infty f_r(s|t) \prod_{p \ne r} \bigl(1 - \bar F_p(s)\bigr)\, ds$$

where $\bar F_p(\cdot)$ is the distribution function corresponding to
$\bar f_p(\cdot)$. Similar to (4:B.5), the long term expected delay
under Class 0 information given that the traffic pattern has
not changed is

(4:B.7)   $$W_0 = \sum_{r=1}^{R} \int_0^\infty t\, g_r(t) \int_0^\infty f_r(s|t) \prod_{p \ne r} \bigl(1 - \bar F_p(s)\bigr)\, ds\, dt$$
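The unchanged-pattern case can also be explored by simulation, which avoids the double integral. The sketch below is a Monte Carlo illustration under assumed densities (exponential expected delays with means .20, .25 and .40 sec, and actual delays that are exponential with mean equal to the expected delay); these choices, the sample size and all names are ours. For each simulated traffic pattern it applies the minimum delay rule once to the expected delays (Class I) and once to a single sampled delay per route (Class 0).

```python
import random

# Assumed means of the expected delay on each route, in sec (illustrative).
MEAN_EXPECTED_DELAY = [0.20, 0.25, 0.40]


def simulate(n_samples=50_000, seed=1):
    """Estimate the unchanged-pattern delays under Class I and Class 0."""
    rng = random.Random(seed)
    sum_class_1 = 0.0
    sum_class_0 = 0.0
    for _ in range(n_samples):
        # Traffic pattern: one expected delay t_r per route, drawn from g_r.
        t = [m * rng.expovariate(1.0) for m in MEAN_EXPECTED_DELAY]
        # Class 0 information: one sample s_r from f_r(.|t_r) per route.
        s = [t_r * rng.expovariate(1.0) for t_r in t]
        r_1 = min(range(len(t)), key=t.__getitem__)   # rule (4:B.2)
        r_0 = min(range(len(s)), key=s.__getitem__)   # rule (4:B.3)
        # The packet's own delay is independent of s_r and has mean t_r,
        # so its conditional expectation on the chosen route is t[route].
        sum_class_1 += t[r_1]
        sum_class_0 += t[r_0]
    return sum_class_1 / n_samples, sum_class_0 / n_samples


W_I_est, W_0_est = simulate()
print(round(W_I_est, 3), round(W_0_est, 3))
```

Since the route chosen with Class I information has the smallest expected delay in every simulated pattern, the Class 0 estimate can never fall below the Class I estimate; how large the gap is depends on the assumed densities.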

Now  we  consider  the  second  case  where  the  expected  values  of 
the  delays  along  the  alternative  routes  at  the  time  of 
transmission  of  the  packet  are  independent  of  the  values  of 
the  expected  delays  at  the  time  data  was  collected  for  the 
routing  decision.  In  this  case,  the  long  term  expected  delay 
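given that route r is chosen is simply $\bar{\bar d}_r$, for both Class I
and Class 0 information. The probability that route r is
chosen is

$$\int_0^\infty g_r(t) \prod_{p \ne r} \bigl(1 - G_p(t)\bigr)\, dt$$

for Class I, and

$$\int_0^\infty \bar f_r(t) \prod_{p \ne r} \bigl(1 - \bar F_p(t)\bigr)\, dt$$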

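for Class 0. Therefore, the long term expected delay under
Class I information given that the traffic pattern has
changed is

(4:B.8)   $$\bar W_I = \sum_{r=1}^{R} \bar{\bar d}_r \int_0^\infty g_r(t) \prod_{p \ne r} \bigl(1 - G_p(t)\bigr)\, dt$$

and the corresponding quantity for Class 0 information is

(4:B.9)   $$\bar W_0 = \sum_{r=1}^{R} \bar{\bar d}_r \int_0^\infty \bar f_r(t) \prod_{p \ne r} \bigl(1 - \bar F_p(t)\bigr)\, dt$$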
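When the traffic pattern has changed, the long term expected delay is a weighted average of the route delays, with the choice probabilities as weights, so it can never be smaller than the delay of the best fixed route. The sketch below illustrates this for Class I information under assumptions of our own: exponential densities for the expected delays with means .20, .25 and .40 sec, and conditional delay densities with mean equal to the expected delay, so that the long term expected delay of a route equals the mean of its expected delay.

```python
import math

# Assumed long term expected delays per route, in sec (illustrative).
MEAN_EXPECTED_DELAY = [0.20, 0.25, 0.40]
RATES = [1.0 / m for m in MEAN_EXPECTED_DELAY]


def choice_probability(r, upper=5.0, steps=5000):
    """Midpoint rule for the integral of g_r(t) * prod_{p != r} (1 - G_p(t)) dt,
    with exponential g_p, so that 1 - G_p(t) = exp(-rate_p * t)."""
    h = upper / steps
    routes = range(len(RATES))
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        g_r = RATES[r] * math.exp(-RATES[r] * t)
        others = math.prod(math.exp(-RATES[p] * t) for p in routes if p != r)
        total += g_r * others * h
    return total


PROBS = [choice_probability(r) for r in range(len(RATES))]
# Changed-pattern delay under Class I: weighted average of the route delays.
W_I_changed = sum(d * p for d, p in zip(MEAN_EXPECTED_DELAY, PROBS))
W_II = min(MEAN_EXPECTED_DELAY)     # fixed routing always uses route 1
print([round(p, 3) for p in PROBS], round(W_I_changed, 4), W_II)
```

With these assumed densities, routing on outdated Class I information is worse than fixed routing, which is why the probability that the traffic pattern has not changed since the last update matters for the comparison of the information policies.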

The probability densities $\bar f_r(\cdot)$, r=1,...,R, also determine
what portion of the delay is the fixed delay. By fixed delay
we mean the delay along a route when all queues are empty.
According to the assumption that all variation in delay is a
consequence of variations in queue length (see remark
(4:1.4)) it follows that the fixed delay along route r is

$$d_{r\,\mathrm{fix}} = \max \{\, d \mid \bar F_r(d') = 0 \;\; \forall\, d' \le d \,\}$$

The  fixed  portion  of  the  delay  under  Class  I or  class  0 
information  is  then  the  sum  of  the  fixed  delays  along  each 
of  the  routes,  weighted  with  the  probability  that  that  route 
is  chosen.  Therefore  we  get  similar  to  (4:B.8)  and  (4:B.9) 

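(4:B.10)   $$W_{I\,\mathrm{fix}} = \sum_{r=1}^{R} d_{r\,\mathrm{fix}} \int_0^\infty g_r(t) \prod_{p \ne r} \bigl(1 - G_p(t)\bigr)\, dt$$

and

(4:B.11)   $$W_{0\,\mathrm{fix}} = \sum_{r=1}^{R} d_{r\,\mathrm{fix}} \int_0^\infty \bar f_r(t) \prod_{p \ne r} \bigl(1 - \bar F_p(t)\bigr)\, dt$$

In the decentralized control of computer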
communication  it  is  appropriate  to  consider  the  controllers 
as  having  the  same  objective:  the  efficient  operations  of 
the  communication  medium.  In  this  sense  decentralized 
control  of  computer  communication  fits  in  the  area  of 
multi-person decision making that is designated as team
theory. Section 2:1 was devoted to aspects of team theory,
specifically  to  the  introduction  of  the  notions  of  symmetric 
team  problems  and  symmetric  solutions.  For  the  purpose  of 
introduction  and  immediate  application  it  was  sufficient  to 
consider  only  finite  team  problems  in  strategic  form.  For 
the  purpose  of  making  the  theory  more  complete  it  is 
important  to  extend  the  definition  and  analysis  of  symmetric 
team  problems  and  symmetric  solutions  to  team  problems  with 
an infinite number of possible strategies and to consider
the problem in extensive form.


*) In a team problem in strategic form, each decision maker
chooses, from his set of possible strategy-maps, a map which
maps his (sequence of) information into his (sequence of)
decisions. In extensive form a decision is determined for
each decision maker at each point in time, given his
information at that point in time [6].

Finding applications of the notions of symmetric
team problems and symmetric solutions is not hard. By way of
illustration we considered the corridor problem in section
2:1, and later in section 2:2 the access problem in wire
communication  was  considered.  The  unrestricted  and  symmetric 
solutions  to  the  access  problem  correspond  to  two  different 
information structures: in the unrestricted case the
information of what number is assigned to a station can be
used to base decisions on; in the symmetric case it cannot
be used. A third information structure was considered for
comparison purposes. It can be concluded that the
information  structure  is  an  important  parameter  in  finding 
and  explaining  different  solutions  to  problems  in 
decentralized  control. 

The multi-access satellite channel, which was
considered in chapter 3, has as its characteristic feature
that the ground stations can only share information with
considerable delay. The model was chosen such that with some
light assumptions it could be proven that the delayed
information was as good as no information at all, or in
other words, the optimal control law could be open-loop. In
general, for multi-person decision problems with delayed
sharing of information such a strong result cannot be
proved. However, it may be feasible and worthwhile to show
that in general, as the delay gets larger, the shared
information gets less valuable in terms of the objective
function.  From  the  result  which  states  that  the  optimal 
control  law  is  open  loop,  it  should  not  be  concluded  that  in 



the  operation  of  the  satellite  channel  there  is  no  need  for 
feedback  at  all.  The  result  only  means  that  feedback  should 
not  be  at  the  level  of  delayed  reported  values  of  the  state. 
However, when parameters such as the arrival rate p of
packets at the stations change slowly (compared with the
delay), feedback can be at the level of changing the
open-loop control law whenever the current value of p is
updated.

The feedback suggested in the preceding paragraph
corresponds directly to the main conclusion of chapter 4. In
this  chapter  on  information  policies  for  routing  in  a 
computer  network,  different  classes  of  information  were 
defined  on  which  a given  decision  rule  could  be  based.  Class 
0 corresponds  to  delayed  sharing  of  the  rapidly  changing 
value  of  the  state  of  the  network,  Class  I to  the  delayed 
sharing  of  the  expected  value  of  the  state  of  the  network, 
which  is  slowly  changing  and  Class  II  corresponds  to  sharing 
a constant:  the  long  term  expected  state  of  the  network.  The 
conclusion  was  that  the  decision  rule  should  be  based  on 
Class  I information,  i.e.  feedback  on  the  level  of  a slowly 
changing  parameter.  The  definition  of  different  classes  of 
information  policies  was  based  on  modeling  a stochastic 
process (here, the value of the delay along a link) as a
cascade of stochastic processes. At each next level of the
cascade the (expected) time between changes of the value of
the stochastic process is larger than at the previous level.
In chapter 4 this method of constructing different classes
of information policies was plausible in view of the
specific application, but the validity of this method is
probably much more general. Therefore it would be useful to
find  generalizations  that  are  well  founded  on  principles  of 
stochastic  processes  and  mathematical  statistics. 

The conclusion of the conclusions could be stated as:

In  decision  making,  particularly  in 
decentralized  decision  making,  the 
question  what  is  the  information  on 
which  decisions  are  based  is  of  prime 
importance.  Next  is  the  question  what 
decisions  are  made. 
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